Coding with Large Language Models

Overview


In the article “Managing the development of large software systems: concepts and techniques”, Winston W. Royce describes his personal experiences of software development and outlines patterns for structuring work. The first pattern in the article, shown in Figure 1, is a simple one that consists of a single step from analysis to coding.

Fig 1. -- In simple projects that are intended for internal use, a simple process from analysis of the problem to coding can be sufficient.

The analysis step is about forming an understanding of the problem, while the coding step is about the concrete implementation (writing code) of a solution to that problem. Although Royce does not explicitly highlight this early in the article, analysis and coding can be done iteratively until a satisfactory solution is found.

In this part, we focus on how large language models can help us with the concrete coding task. The chapters in this part and their key concepts are as follows.

  • Source Code Generation refers to automatic generation of source code from a natural language description of a problem.

  • Code Completion refers to automatic completion of code from a partial code snippet.

  • Test Generation refers to automatic generation of tests for code, where the tests can also be code.

  • Code Translation and Rewrites refers to automatic translation of code from one language to another and rewriting of code to match a specific style.

  • Working with APIs discusses challenges related to creating code that uses possibly unknown application programming interfaces (APIs).

  • Code Review and Summarization discusses the possibilities of reviewing code and creating summaries of code changes.

  • Finally, Utility and Trustworthiness of Generated Code highlights known issues related to the utility and trustworthiness of code generated by large language models.
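
To make the first three tasks in the list above more concrete, the following minimal sketch shows, for one toy problem, what the input and a plausible output of each task might look like. Everything in it is an assumption made for illustration: the function `is_even`, the description text, and the test are hand-written stand-ins, and no language model is called.

```python
# Illustrative only: hand-written stand-ins for what a model might receive
# and produce in each task. No language model is involved.

# Source code generation: natural language description in, source code out.
description = "Write a function is_even(n) that returns True if n is even."

# Code completion: partial code snippet in, the rest of the code out.
partial_snippet = "def is_even(n):\n    return "
completion = "n % 2 == 0\n"

# Test generation: code in, tests (which are also code) out.
generated_test = (
    "def test_is_even():\n"
    "    assert is_even(4)\n"
    "    assert not is_even(7)\n"
)

# Combine the completed function with the test and run both.
namespace = {}
exec(partial_snippet + completion, namespace)
exec(generated_test, namespace)
namespace["test_is_even"]()  # raises AssertionError if the code is wrong
```

Running the test against the completed code, as in the last lines, is also a small instance of the iterative loop between analysis and coding: if the assertion failed, the description or the code would be revisited.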

Of these topics, code generation and code completion are among the most widely studied applications of large language models in software engineering.

There's more than just...

Software engineering is much more than coding. We’ll get to that in the next parts of this material. For now, let’s focus on the coding part and get started with source code generation.