Code Completion
Learning Objectives
- You know the term code completion and know how large language models can be used to aid in code completion.
Code completion is similar to source code generation, except that the model is given existing code and fills in what is missing. It can be broadly divided into two tasks: filling in a gap inside a code snippet (code infilling) and continuing a code snippet from where it ends (code completion in the narrow sense).
In the example below, we demonstrate code completion. We'll continue with the quicksort theme: the prompt gives only the function signature, which implicitly asks for a function that implements the quicksort algorithm. The model then generates the code for the function.
Please complete the following function.
def quicksort(array):
Certainly! Here's the quicksort function:
def quicksort(array):
    if len(array) <= 1:
        return array
    pivot = array[len(array) // 2]
    left = [x for x in array if x < pivot]
    middle = [x for x in array if x == pivot]
    right = [x for x in array if x > pivot]
    return quicksort(left) + middle + quicksort(right)
...
Similarly, below, we demonstrate code infilling. In the example, we have provided a comment on where the code should be filled. For simplicity, we are again using the same quicksort algorithm.
Please fill in the details for the following function:
def quicksort(array):
    if len(array) <= 1:
        return array
    pivot = array[len(array) // 2]
    # fill in details here
    return quicksort(left) + middle + quicksort(right)
Certainly! The main idea behind the quicksort algorithm is to select a 'pivot' element from the array and partition the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. For the sub-arrays, the algorithm is recursively applied.
Here's the detail filled in for your function:
def quicksort(array):
    if len(array) <= 1:
        return array
    pivot = array[len(array) // 2]
    left = [x for x in array if x < pivot]
    middle = [x for x in array if x == pivot]
    right = [x for x in array if x > pivot]
    return quicksort(left) + middle + quicksort(right)
...
For large language models that work as chatbots, like the one that we have on this page, both code completion and code infilling typically return the full solution. The difference is in the input: for infilling, the input is a code snippet with a gap somewhere inside it, while for completion, the input is the beginning of a code snippet that the model continues.
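The difference between the two inputs can be sketched in code. The helper names below (`build_completion_prompt`, `build_infill_prompt`) and the `GAP_MARKER` constant are assumptions made for illustration and do not belong to any library; the sketch only shows that completion takes a prefix, while infilling takes a prefix and a suffix with a marked gap between them.

```python
GAP_MARKER = "# fill in details here"  # assumed placeholder, mirrors the infilling example


def build_completion_prompt(prefix: str) -> str:
    """Completion: the model sees only the beginning of the code."""
    return "Please complete the following function.\n\n" + prefix


def build_infill_prompt(prefix: str, suffix: str, indent: str = "    ") -> str:
    """Infilling: the model sees the code before and after a marked gap."""
    return (
        "Please fill in the details for the following function:\n\n"
        + prefix
        + indent + GAP_MARKER + "\n"
        + suffix
    )


prefix = (
    "def quicksort(array):\n"
    "    if len(array) <= 1:\n"
    "        return array\n"
    "    pivot = array[len(array) // 2]\n"
)
suffix = "    return quicksort(left) + middle + quicksort(right)\n"

completion_prompt = build_completion_prompt("def quicksort(array):\n")
infill_prompt = build_infill_prompt(prefix, suffix)
print(infill_prompt)
```

Either string could then be sent to a chatbot as the user message; how that is done depends on the service and is left out here.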
For both code completion and code infilling, we can also provide additional context for the large language model. The additional context could be in the form of code comments or, in the case of Python, a docstring, which is a string literal placed at the start of a function that describes its functionality. The model then completes or fills in the code for the function, potentially matching the description in the docstring. In the example below, the function name is generic, but the docstring states which algorithm the implementation should use.
Please complete the following function:
def sort(array):
    """An implementation of the bubble sort algorithm"""
Sure, here's a simple implementation of the bubble sort algorithm:
def sort(array):
    """An implementation of the bubble sort algorithm"""
    n = len(array)
    for i in range(n):
        for j in range(0, n-i-1):
            if array[j] > array[j+1]:
                array[j], array[j+1] = array[j+1], array[j]
    return array
...
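Because a completion only potentially matches the docstring, it can be worth checking it before use. The sketch below is one way to do that, assuming the completed function from the dialogue above has been pasted in unchanged; the helper `matches_builtin_sort` and the random test inputs are illustrative choices, not part of the original example.

```python
import random


def sort(array):
    """An implementation of the bubble sort algorithm"""
    n = len(array)
    for i in range(n):
        for j in range(0, n - i - 1):
            if array[j] > array[j + 1]:
                array[j], array[j + 1] = array[j + 1], array[j]
    return array


def matches_builtin_sort(candidate, trials=100, seed=0):
    """Compare a candidate sort against Python's built-in sorted() on random lists."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        # Pass a copy: the bubble sort above sorts its argument in place.
        if candidate(list(data)) != sorted(data):
            return False
    return True


print(matches_builtin_sort(sort))  # prints True
```

A check like this confirms that the output is sorted, not that the model actually used bubble sort as the docstring requested; confirming that still requires reading the code.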
Code generation and code completion have different emphases. For code generation, the emphasis is on the natural language description of the code, while for code completion, the emphasis is on the code itself.
Depending on the intended use, there are also differences in what is required of the model. Code completion could be used in a live programming environment that suggests completions continuously. In such a case, the model needs to respond very rapidly. Code generation, on the other hand, could be used to produce code for a larger task, so the response does not have to be as fast as for code completion (even though a long wait for a response still harms usability).
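To make the speed requirement concrete, here is a minimal sketch of an editor-side check that discards suggestions that arrive too late. Everything in it is an assumption for illustration: the 0.2 second budget is arbitrary, `stub_model` is a hypothetical stand-in for a real model call, and a real integration would more likely cancel a slow request than wait for it to finish.

```python
import time

LATENCY_BUDGET_SECONDS = 0.2  # assumed budget, chosen only for illustration


def stub_model(prefix: str) -> str:
    """Hypothetical stand-in for a model call; returns a canned completion."""
    return "    return sorted(array)"


def suggest_within_budget(model, prefix, budget=LATENCY_BUDGET_SECONDS):
    """Return the model's suggestion, or None if it took longer than the budget.

    The call is not interrupted; a late suggestion is simply dropped.
    """
    start = time.perf_counter()
    suggestion = model(prefix)
    elapsed = time.perf_counter() - start
    return suggestion if elapsed <= budget else None


print(suggest_within_budget(stub_model, "def sort(array):\n"))
```

For code generation on a larger task, the same check would use a much looser budget or none at all.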
Although the above examples have been constructed using a dialogue with a large language model, there are also development environment integrations. As an example, GitHub Copilot can be integrated into Visual Studio Code.
GitHub Copilot is available for teachers and students free of charge.