Research Interests

Publications (18 total)

Kento Kawaharazuka, Tatsuya Matsushima, Andrew Gambardella, Jiaxian Guo, Chris Paxton, Andy Zeng
Citations: 0 · Views: 0

A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning
Jiaxian Guo, Mingming Gong, Dacheng Tao
International Conference on Learning Representations (ICLR), 2022 · Citations: 13
Keywords: Model-Based Reinforcement Learning, Unsupervised Dynamics Generalization
Abstract (excerpt): ...invariance within a single environment, a relational head is proposed to enforce the similarity between $\hat{Z}$ estimated from the same environment. As a result, redundant information in $\hat{Z}$ is reduced. We empirically show that the $\hat{Z}$ estimated by our method contains less redundant information than that of previous methods, and that such $\hat{Z}$ significantly reduces dynamics prediction errors and improves the performance of model-based RL methods on zero-shot new environments with unseen dynamics. Code is available at https://github.com/CR-Gjx/RIA.
URLs: https://arxiv.org/abs/2206.04551 · https://openreview.net/forum?id=YRq0ZUnzKoZ
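
The following is a minimal, hedged PyTorch sketch of the relational-head idea described in the abstract: a small network scores pairs of estimated environment embeddings $\hat{Z}$ and is trained to predict whether the two members of a pair come from the same environment, which encourages embeddings from the same environment to be similar. The names RelationalHead, relational_loss, and env_ids are illustrative assumptions, not the authors' API; the actual implementation is in the linked RIA repository.

# Hedged sketch (not the authors' code): a relational head that encourages
# environment embeddings z_hat estimated from the same environment to be
# similar, by classifying whether a pair of embeddings shares an environment.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationalHead(nn.Module):
    def __init__(self, z_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * z_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # logit for "same environment?"
        )

    def forward(self, z_a: torch.Tensor, z_b: torch.Tensor) -> torch.Tensor:
        # Score a pair of embeddings; output shape (N,)
        return self.net(torch.cat([z_a, z_b], dim=-1)).squeeze(-1)

def relational_loss(head: RelationalHead, z_hat: torch.Tensor, env_ids: torch.Tensor) -> torch.Tensor:
    """z_hat: (N, z_dim) estimated embeddings; env_ids: (N,) integer environment labels."""
    n = z_hat.size(0)
    idx_a = torch.randint(0, n, (n,))          # random pairs sampled within the batch
    idx_b = torch.randint(0, n, (n,))
    logits = head(z_hat[idx_a], z_hat[idx_b])
    labels = (env_ids[idx_a] == env_ids[idx_b]).float()
    return F.binary_cross_entropy_with_logits(logits, labels)

Minimizing this loss jointly with the encoder pulls together embeddings that share an environment, which is one simple way to realize the similarity constraint the abstract describes.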

From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models
Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, Dacheng Tao, Steven C. H. Hoi
arXiv (Cornell University), 2022
Abstract: Large language models (LLMs) have demonstrated excellent zero-shot generalization to new language tasks. However, effective use of LLMs for zero-shot visual question answering (VQA) remains challenging, primarily because of the modality disconnection and task disconnection between the LLM and the VQA task. End-to-end training on vision and language data may bridge these disconnections, but it is inflexible and computationally expensive. To address this issue, we propose Img2Prompt, a plug-and-play module that provides prompts bridging the aforementioned modality and task disconnections, so that LLMs can perform zero-shot VQA without end-to-end training. To provide such prompts, we employ LLM-agnostic models that describe image content and construct question-answer pairs, which effectively guide the LLM to perform zero-shot VQA. Img2Prompt offers the following benefits: 1) it can flexibly work with various LLMs to perform VQA; 2) without the need for end-to-end training, it significantly reduces the cost of deploying an LLM for zero-shot VQA; 3) it achieves comparable or better performance than methods relying on end-to-end training. For example, it outperforms Flamingo by 5.6% on VQAv2, and on the challenging A-OKVQA dataset it outperforms few-shot methods by as much as 20%.
URL: https://openalex.org/W4312107707
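
As a rough illustration of the prompt-construction step described in the abstract, here is a minimal, hedged Python sketch: it assembles an image caption and self-constructed question-answer exemplars into a single textual prompt that a frozen, text-only LLM could complete. The function build_vqa_prompt and its inputs are hypothetical names; in the actual Img2Prompt approach, the caption and exemplar QA pairs are produced by separate, LLM-agnostic vision models.

# Hedged sketch (not the paper's implementation): building a textual prompt in
# the spirit of Img2Prompt, so a frozen LLM can answer a visual question
# without any end-to-end vision-language training.

def build_vqa_prompt(caption: str, synthetic_qa: list[tuple[str, str]], question: str) -> str:
    lines = [f"Context: {caption}"]
    for q, a in synthetic_qa:              # exemplars ground the LLM in the image content
        lines.append(f"Question: {q}")
        lines.append(f"Answer: {a}")
    lines.append(f"Question: {question}")  # the actual VQA question
    lines.append("Answer:")                # the frozen LLM completes from here
    return "\n".join(lines)

if __name__ == "__main__":
    prompt = build_vqa_prompt(
        caption="A man in a red jacket is skiing down a snowy slope.",
        synthetic_qa=[("What is the man wearing?", "a red jacket"),
                      ("What is the man doing?", "skiing")],
        question="What season is it?",
    )
    print(prompt)  # the string a frozen LLM would complete, e.g. with "winter"

The design point the abstract emphasizes is that everything the LLM needs arrives as text, so the LLM itself stays frozen and can be swapped without retraining.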