Paper: CodeTF: One-stop Transformer Library for SOTA Code LLM [15 pages]
Researchers at Salesforce AI are developing CodeTF, a one-stop, transformer-based Python library for code large language models (Code LLMs) and code intelligence. It provides a seamless interface for training and inference on code intelligence tasks such as code summarization, translation, and generation, and aims to make it easy to integrate cutting-edge language models into real-world applications.
The current version of the library offers:
Support for Model Serving (CodeBERT, CodeT5, CodeGen, CodeT5+, Incoder, StarCoder, etc.).
Fine-Tuning Your Own Models: They provide an API for quickly fine-tuning your own Code LLMs using state-of-the-art techniques for parameter-efficient fine-tuning (via HuggingFace PEFT) in distributed environments.
Multiple Supported Tasks, including nl2code, code summarization, code completion, code translation, code refinement, clone detection, and defect prediction.
Preprocessed versions of well-known benchmarks (HumanEval, MBPP, CodeXGLUE, APPS, etc.).
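The parameter-efficient fine-tuning mentioned above (e.g. LoRA, as implemented in HuggingFace PEFT) freezes the pretrained weights and trains only a low-rank update. A toy, dependency-free sketch of the core idea, with all names illustrative rather than the CodeTF or PEFT API:

```python
# LoRA-style parameter-efficient fine-tuning, in miniature:
# keep the pretrained weight W frozen and learn only a low-rank
# update B @ A (rank r << matrix dimensions), added with scale alpha/r.

def matmul(X, Y):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pretrained weight (2x2)
B = [[0.5], [0.0]]             # trainable adapter, 2x1 (rank r = 1)
A = [[0.0, 1.0]]               # trainable adapter, 1x2
alpha, r = 2.0, 1              # LoRA scaling hyperparameters

delta = matmul(B, A)           # low-rank update, full 2x2 shape
W_eff = [[w + (alpha / r) * d for w, d in zip(w_row, d_row)]
         for w_row, d_row in zip(W, delta)]
# W_eff == [[1.0, 1.0], [0.0, 1.0]]
```

For a d x d weight, the adapters add only 2*d*r trainable parameters instead of d*d, which is why techniques like this make fine-tuning large Code LLMs tractable.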
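Benchmarks such as HumanEval and MBPP are typically scored with the pass@k metric. As a point of reference (this is the standard unbiased estimator from Chen et al., 2021, not code taken from CodeTF):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples generated, c of them correct.

    Computes 1 - C(n - c, k) / C(n, k), the probability that at least
    one of k randomly drawn samples is correct.
    """
    if n - c < k:
        return 1.0  # too few incorrect samples to fill a draw of size k
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 2 samples, 1 correct -> pass@1 is 0.5
print(pass_at_k(2, 1, 1))  # 0.5
```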
In future versions, they plan on:
Enabling even large models such as InstructCodeT5+ to run efficiently on commercial laptops or workstations.
Conducting comprehensive evaluations on coding benchmarks.
Enhancing the Code Utility module by adding support for other programming languages, such as Go, Rust, C#, and more.
Adding utilities for extracting further useful features from code, such as call graphs, control flow, and data flow.
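A call-graph utility of the kind planned here can be sketched for Python using the standard-library ast module. This is an illustrative toy (direct, named calls only), not CodeTF code:

```python
import ast

def call_graph(source: str) -> dict:
    """Map each function name in `source` to the set of names it calls.

    Only handles direct calls to plain names (e.g. `helper(x)`), not
    attribute calls like `obj.method()` -- enough to show the idea.
    """
    tree = ast.parse(source)
    graph = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            calls = set()
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    calls.add(sub.func.id)
            graph[node.name] = calls
    return graph

src = """
def helper(x):
    return x + 1

def main(xs):
    return [helper(x) for x in xs]
"""
graph = call_graph(src)
# graph == {'helper': set(), 'main': {'helper'}}
```

Real tooling would also resolve attribute calls, imports, and aliasing, which is presumably where per-language support (Go, Rust, C#, etc.) gets involved.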
So essentially,
"Salesforce AI has a great Code LLM library called CodeTF!"