Geo-FuB: A Method for Constructing an Operator-Function Knowledge Base for Geospatial Code Generation Tasks Using Large Language Models
Shuyang Hou, Anqi Zhao, Jianyuan Liang, Zhangxiao Shen, Huayi Wu · October 28, 2024
Summary
Geo-FuB is a framework that uses large language models to construct a geospatial operator-function knowledge base for geospatial code generation. It pairs retrieval-augmented generation with an external knowledge base and comprises three components: Geo-FuSE for semantic feature extraction, Geo-FuST for identifying operator combinations, and Geo-FuM for aligning those combinations with the function semantic framework. Evaluated on 154,075 Google Earth Engine scripts, Geo-FuB achieves 88.89% overall accuracy, with 92.03% structural accuracy and 86.79% semantic accuracy. By bringing domain-specific expertise into the generation process, the framework makes large language models more reliable at geospatial code generation and offers a way to address "coding hallucinations".
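To make the retrieval-augmented setup concrete, the following is a minimal Python sketch of how an operator-function knowledge base might be queried before prompting a model. It is not the authors' implementation: the `KnowledgeEntry` type, the three toy entries, the keyword-overlap scoring, and the prompt layout are all assumptions made for illustration. Only the Google Earth Engine operator names are real API names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeEntry:
    """One hypothetical knowledge-base record (illustrative schema)."""
    function: str     # semantic function label
    operators: tuple  # operator combination associated with that function


# Toy entries for illustration only; they are not taken from the paper.
KNOWLEDGE_BASE = [
    KnowledgeEntry(
        "cloud-free composite",
        ("ee.ImageCollection.filterDate",
         "ee.ImageCollection.filterBounds",
         "ee.ImageCollection.median"),
    ),
    KnowledgeEntry(
        "vegetation index calculation",
        ("ee.Image.normalizedDifference", "ee.Image.rename"),
    ),
    KnowledgeEntry(
        "zonal statistics",
        ("ee.Image.reduceRegions", "ee.Reducer.mean"),
    ),
]


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().replace("-", " ").split())


def retrieve(query, kb, top_k=1):
    """Rank entries by word overlap with the query (an assumed, naive scorer)."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(e.function)), e) for e in kb]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


def build_prompt(task, kb):
    """Prepend retrieved operator combinations to the task description."""
    hits = retrieve(task, kb)
    lines = [f"- {e.function}: {' -> '.join(e.operators)}" for e in hits]
    context = "\n".join(lines) if lines else "- (no matching entry)"
    return f"Known operator combinations:\n{context}\n\nTask: {task}"


if __name__ == "__main__":
    print(build_prompt("compute a vegetation index", KNOWLEDGE_BASE))
```

A real system would likely replace the word-overlap scorer with embedding-based retrieval. The sketch only shows where the knowledge base sits in the pipeline: between the user's task and the prompt sent to the model.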
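Introduction
Background
Overview of coding issues in geospatial code generation
Objectives
Main objectives of the Geo-FuB study
Methods
Data collection
Data sources and collection methods for large language models
Data preprocessing
Data preprocessing strategies and techniques
Geo-FuB framework
Geo-FuSE: semantic feature extraction
Function and implementation
Geo-FuST: operator combination identification
Function and implementation
Geo-FuM: alignment with the function semantic framework
Function and implementation
Evaluation
Experimental dataset
Details of the datasets used for evaluation
Evaluation metrics
Definition and selection of evaluation metrics
Evaluation results
Accuracy: 88.89% overall, 92.03% structural, 86.79% semantic
Improving the reliability of geospatial code generation
Addressing coding hallucinations
How Geo-FuB addresses coding hallucinations
Integrating domain expertise
How Geo-FuB improves the reliability of large language models in geospatial code generation
Conclusion
Overall contributions
Contributions of Geo-FuB to the field of geospatial code generation
Future outlook
Future research directions and potential applications of Geo-FuB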
Basic info
papers
databases
software engineering
artificial intelligence