Paper: CodeTF: One-stop Transformer Library for SOTA Code LLM [15 pages]
Researchers at Salesforce AI are developing CodeTF, a one-stop, transformer-based Python library for code large language models (Code LLMs) and code intelligence. It provides a seamless interface for training and inference on code intelligence tasks such as code summarization, translation, and generation, and aims to make it easy to integrate cutting-edge language models into real-world applications.
The current version of the library offers:
Support for Model Serving (CodeBERT, CodeT5, CodeGen, CodeT5+, Incoder, StarCoder, etc.).
Fine-Tuning Your Own Models: They provide an API for quickly fine-tuning your own Code LLMs using state-of-the-art techniques for parameter-efficient fine-tuning (via HuggingFace PEFT) in distributed environments.
Multiple Supported Tasks, including nl2code, code summarization, code completion, code translation, code refinement, clone detection, and defect prediction.
Preprocessed versions of well-known benchmarks (HumanEval, MBPP, CodeXGLUE, APPS, etc.).
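The parameter-efficient fine-tuning mentioned above (e.g. LoRA, as implemented in HuggingFace PEFT) freezes the pretrained weights and trains only a low-rank update. A toy, dependency-free sketch of the core idea, with all names illustrative rather than the CodeTF or PEFT API:

```python
# LoRA-style parameter-efficient fine-tuning, in miniature:
# keep the pretrained weight W frozen and learn only a low-rank
# update B @ A (rank r << matrix dimensions), added with scale alpha/r.

def matmul(X, Y):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pretrained weight (2x2)
B = [[0.5], [0.0]]             # trainable adapter, 2x1 (rank r = 1)
A = [[0.0, 1.0]]               # trainable adapter, 1x2
alpha, r = 2.0, 1              # LoRA scaling hyperparameters

delta = matmul(B, A)           # low-rank update, full 2x2 shape
W_eff = [[w + (alpha / r) * d for w, d in zip(w_row, d_row)]
         for w_row, d_row in zip(W, delta)]
# W_eff == [[1.0, 1.0], [0.0, 1.0]]
```

For a d x d weight, the adapters add only 2*d*r trainable parameters instead of d*d, which is why techniques like this make fine-tuning large Code LLMs tractable.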
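Benchmarks such as HumanEval and MBPP are typically scored with the pass@k metric. As a point of reference (this is the standard unbiased estimator from Chen et al., 2021, not code taken from CodeTF):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples generated, c of them correct.

    Computes 1 - C(n - c, k) / C(n, k), the probability that at least
    one of k randomly drawn samples is correct.
    """
    if n - c < k:
        return 1.0  # too few incorrect samples to fill a draw of size k
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 2 samples, 1 correct -> pass@1 is 0.5
print(pass_at_k(2, 1, 1))  # 0.5
```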
In future versions, they plan on:
Enabling even large models such as InstructCodeT5+ to run efficiently on commercial laptops or workstations.
Conducting comprehensive evaluations on coding benchmarks.
Enhancing the Code Utility module by adding support for other programming languages, such as Go, Rust, C#, and more.
Adding utilities for extracting further useful features from code, such as call graphs, control flow, and data flow.
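A call-graph utility of the kind planned here can be sketched for Python using the standard-library ast module. This is an illustrative toy (direct, named calls only), not CodeTF code:

```python
import ast

def call_graph(source: str) -> dict:
    """Map each function name in `source` to the set of names it calls.

    Only handles direct calls to plain names (e.g. `helper(x)`), not
    attribute calls like `obj.method()` -- enough to show the idea.
    """
    tree = ast.parse(source)
    graph = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            calls = set()
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    calls.add(sub.func.id)
            graph[node.name] = calls
    return graph

src = """
def helper(x):
    return x + 1

def main(xs):
    return [helper(x) for x in xs]
"""
graph = call_graph(src)
# graph == {'helper': set(), 'main': {'helper'}}
```

Real tooling would also resolve attribute calls, imports, and aliasing, which is presumably where per-language support (Go, Rust, C#, etc.) gets involved.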
So essentially,
"Salesforce AI has a great Code LLM library called CodeTF!"