DataWorksAI Text-to-SQL LLM chatbot
This research project involves Master’s students in Analytics and Machine Intelligence at Northeastern University, Boston, MA.
This project aims to improve the accuracy of a text-to-SQL chatbot, which translates natural-language questions into SQL queries, using the OpenAI platform and GPT transformer-based large language models (LLMs). It allows users to query a database interactively on the Mass Open Cloud platform. The DataWorksAI text-to-SQL chatbot is built with the LangChain and LlamaIndex frameworks, a retrieval-augmented generation (RAG) pipeline, the ChromaDB vector database, and the OpenAI platform. The accuracy of the chatbot’s query results is tested on the Integrated Postsecondary Education Data System (IPEDS), a publicly available dataset from the National Center for Education Statistics, hosted locally using Streamlit. The development of the chatbot will leverage OpenShift AI from NERC under the mentorship of the following engineers at Red Hat.