Microsoft AI Presents LLaVA-Med: An Efficiently Trained Large Language and Vision Assistant Revolutionizing Biomedical Research, Delivering Advanced Multimodal Conversations in Less Than 15 Hours
A team of researchers from Microsoft AI has proposed a new method for training a vision-language conversational assistant that can answer open-ended questions about biomedical images. Their approach relies on a large-scale dataset of biomedical figure-caption pairs extracted from PubMed Central, together with GPT-4, which is used to self-instruct open-ended instruction-following data from the captions.
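To make the self-instruct step concrete, the sketch below shows one way a text-only GPT-4 model could be prompted to turn a figure caption into a short instruction-following conversation. This is only an illustration of the general idea, not the authors' pipeline: the prompt wording, model name, helper function, and example caption are all assumptions.

```python
# Illustrative sketch (not the LLaVA-Med pipeline): asking a text-only GPT-4
# model to invent a multi-turn conversation grounded in a figure caption.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def caption_to_dialogue(caption: str) -> str:
    """Generate instruction-following Q&A turns from a biomedical caption."""
    prompt = (
        "You are given the caption of a biomedical figure. Without seeing the "
        "image, write a short multi-turn conversation between a user and an "
        "assistant about the figure. Only use facts stated in the caption.\n\n"
        f"Caption: {caption}"
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    example_caption = (  # hypothetical caption for demonstration only
        "Chest X-ray showing bilateral pulmonary infiltrates consistent with "
        "acute respiratory distress syndrome."
    )
    print(caption_to_dialogue(example_caption))
```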
The model mimics the gradual process by which a layperson acquires biomedical knowledge: it first learns to align biomedical vocabulary using the figure-caption pairs, and then learns open-ended conversational semantics using the instruction-following data generated by GPT-4 (see the sketch below).
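A minimal sketch of what such a two-stage curriculum can look like in PyTorch is shown below. The module names (vision_encoder, projector, llm) are placeholders for a generic LLaVA-style architecture and do not correspond to the released LLaVA-Med training code; the point is only that stage one updates the vision-to-language projection while stage two additionally updates the language model.

```python
# Sketch of a two-stage training schedule for a LLaVA-style model.
# Module names are assumptions, not the actual LLaVA-Med implementation.
import torch.nn as nn


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Freeze or unfreeze all parameters of a module."""
    for p in module.parameters():
        p.requires_grad = trainable


def configure_stage(model: nn.Module, stage: int) -> None:
    # Stage 1: concept alignment -- train only the projection layer on
    # figure-caption pairs, keeping the vision encoder and LLM frozen.
    # Stage 2: instruction tuning -- also update the language model on the
    # GPT-4-generated instruction-following conversations.
    set_trainable(model.vision_encoder, False)
    set_trainable(model.projector, True)
    set_trainable(model.llm, stage == 2)
```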
The result of this work is LLaVA-Med, a Large Language and Vision Assistant for BioMedicine that can hold multimodal conversations and follow open-ended instructions. LLaVA-Med is well suited for answering questions about biomedical images.
After fine-tuning, LLaVA-Med achieves state-of-the-art results on three standard biomedical visual question answering benchmarks. The biomedical instruction-following data and the LLaVA-Med model will be released to foster multimodal research in biomedicine.