The new wave of AI/ML-powered coding tools is changing how software is developed. But ‘coding’ is a broad term, and these tools can help developers in a variety of ways. Their use is already falling into a few distinct categories (often related to the underlying technology), such as interactive code suggestion and fully autonomous test-writing. Knowing which tools do what can help you decide how to adopt generative AI to power up productivity in your software development workflows.
It’s important to note that there’s no ‘silver bullet’. Despite all the talk of ‘general AI’, software delivery toolchains will be made up of multiple best-in-class options for some time yet. Many different variables will have an impact on success – not least code complexity and developer experience – and varying levels of human input will be needed to get up and running, or to verify outputs for correctness.
Using generative AI for software development
Code Suggestion
Code suggestion tools can automatically propose code based on natural language prompts. They typically offer some sort of conversational interface through which a program can be built up iteratively – sometimes even when the user has no coding experience. These tools aren’t necessarily designed to be used inside a software development workflow (at least not today), and their output can be in a variety of programming languages.
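As a simple illustration, a prompt like ‘write a method that checks whether a string is a palindrome’ might produce something along these lines (a hypothetical example written here in Java; real output varies by tool, model and prompt):

```java
// Hypothetical output for the prompt:
// "Write a method that checks whether a string is a palindrome."
public class StringUtils {

    // Returns true if the input reads the same forwards and backwards,
    // ignoring case and any non-alphanumeric characters.
    public static boolean isPalindrome(String input) {
        String cleaned = input.toLowerCase().replaceAll("[^a-z0-9]", "");
        int left = 0;
        int right = cleaned.length() - 1;
        while (left < right) {
            if (cleaned.charAt(left) != cleaned.charAt(right)) {
                return false;
            }
            left++;
            right--;
        }
        return true;
    }
}
```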
Code Completion
Code completion tools are designed to assist developers by predicting the code they want to write based on comments, the initial code they’ve written and the code that surrounds the section in question. This can include auto-completion of variable names, function calls, and even entire code structures. Code completion tools typically work within the Integrated Development Environments (IDEs), like IntelliJ or Visual Studio Code, that developers would normally use to write code.
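A minimal sketch of the idea: the developer writes a comment and a method signature, and the completion tool proposes the body. The class and method names here are hypothetical, not output from any specific tool:

```java
import java.util.List;

// 'Order' is a hypothetical domain class, included only so the example compiles.
class Order {
    private final double price;
    private final boolean cancelled;

    Order(double price, boolean cancelled) {
        this.price = price;
        this.cancelled = cancelled;
    }

    double getPrice() { return price; }
    boolean isCancelled() { return cancelled; }
}

public class OrderService {

    // The developer writes only this comment and the method signature:
    // "Calculate the total value of all orders, ignoring cancelled ones."
    public double totalActiveOrderValue(List<Order> orders) {
        // ...and a completion tool might propose a body along these lines:
        double total = 0.0;
        for (Order order : orders) {
            if (!order.isCancelled()) {
                total += order.getPrice();
            }
        }
        return total;
    }
}
```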
Code Explanation
It can be hard to understand how existing code behaves, especially in large enterprise systems. AI coding tools can help developers by providing explanations for specific lines or blocks of code. They can identify patterns and relationships and generate explanations based on this knowledge. Some can also provide links to documentation, tutorials, and examples that can help developers learn more about the code. These types of tools can be particularly useful for new developers who are unfamiliar with the codebase or experienced developers who need a quick refresher on a specific piece of code.
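For instance, given a dense stream pipeline like the one below, a code explanation tool might respond with a plain-English summary similar to the comment shown (purely illustrative; real explanations differ by tool):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ExplanationExample {

    // A dense line a developer might ask an AI assistant to explain:
    public static Map<Integer, List<String>> groupByLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    // A typical tool-generated explanation (illustrative only) might read:
    // "This method turns the list of words into a stream and groups its elements
    //  into a Map whose keys are word lengths and whose values are lists of the
    //  words that share that length."
}
```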
Unit Test Writing
AI can also be used to automatically write unit tests – a tedious, repetitive task that otherwise requires a high level of coding expertise and deep understanding of the code under test. Tools like Diffblue Cover rely on Reinforcement Learning (RL) approaches to produce tests that are guaranteed to run, compile and be correct. This helps to identify regressions earlier in the development process and allows developers to write more code, leading to more efficient development and higher-quality software.
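To make the idea concrete, here is the style of test such tools aim to produce, shown as a hand-written JUnit 5 sketch against a hypothetical class (not actual output from Diffblue Cover or any other tool):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

// 'Calculator' is a hypothetical class under test, shown only to illustrate
// the style of test that automated unit test writing tools aim to produce.
class Calculator {
    int divide(int dividend, int divisor) {
        return dividend / divisor;
    }
}

class CalculatorTest {

    @Test
    void divideReturnsQuotientForExactDivision() {
        Calculator calculator = new Calculator();
        assertEquals(4, calculator.divide(12, 3));
    }

    @Test
    void divideThrowsArithmeticExceptionWhenDivisorIsZero() {
        Calculator calculator = new Calculator();
        assertThrows(ArithmeticException.class, () -> calculator.divide(1, 0));
    }
}
```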
5 of the Top AI for Code tools available today
Here are some popular generative AI tools that are already helping software developers speed up their workflows, reduce the time spent on repetitive and tedious tasks, and free up time for more innovative, creative and enjoyable work.
1. Replit Ghostwriter
Replit Ghostwriter is an ML-powered code completion tool that leverages a Large Language Model (LLM) to provide suggestions based on the code in your current file. It can be used for a wide range of applications, including generating code snippets, writing documentation, creating blog posts, and more. Ghostwriter is designed to complement your existing programming knowledge and reduce the time you spend searching for help or studying code examples on sites like Stack Overflow.
Ghostwriter helps reduce friction by using code and comment context. It can refactor your code to run faster and translate it into another language (though not yet at scale – we’re definitely still in the realm of ‘pair programming’ for specific code).
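As an illustration of the kind of refactoring these assistants can suggest, the sketch below shows a hand-written loop and an equivalent stream-based rewrite (a hypothetical example, not actual Ghostwriter output; the User class exists only so the snippet compiles):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// 'User' is a hypothetical class, included only so the snippet compiles.
class User {
    private final String name;
    private final boolean active;

    User(String name, boolean active) {
        this.name = name;
        this.active = active;
    }

    String getName() { return name; }
    boolean isActive() { return active; }
}

public class RefactorExample {

    // Before: the original hand-written loop.
    public static List<String> activeNamesLoop(List<User> users) {
        List<String> names = new ArrayList<>();
        for (User user : users) {
            if (user.isActive()) {
                names.add(user.getName());
            }
        }
        return names;
    }

    // After: the stream-based rewrite a refactoring assistant might suggest;
    // the behaviour is identical, just expressed more declaratively.
    public static List<String> activeNamesStream(List<User> users) {
        return users.stream()
                .filter(User::isActive)
                .map(User::getName)
                .collect(Collectors.toList());
    }
}
```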
Ghostwriter can only be used in Replit’s browser-based IDE. The plus side is that users don’t need to download anything and can benefit from the collaboration features Replit’s platform was originally built around. But it’s a significant limitation for any company for which writing code (often valuable intellectual property) in someone else’s cloud environment is out of the question.
Replit has leveraged OpenAI’s GPT models in the past, though the company recently received significant investment from Google. As with any text generation tool powered by an LLM, Ghostwriter may produce inaccurate or irrelevant suggestions (‘hallucinations’), so users need to review and validate the generated content for reliability, accuracy and quality.
2. Tabnine
Tabnine is an AI code completion tool that also uses a Large Language Model to help developers write code more efficiently. It relies on a modified version of OpenAI’s original open-source GPT-2 model to automatically suggest whole-line and full-function code completions. It supports multiple programming languages, including popular ones such as Python, JavaScript, Java, C++, and more.
Tabnine integrates with widely-used code editors and IDEs, making it accessible to a broad spectrum of developers across different programming environments. Although it is not an end-to-end code generator, it enhances an IDE’s auto-completion capability.
Tabnine offers an Enterprise edition which can be set up to run inside an organization’s development environment so that no code or data is sent to an external cloud service or server – a feature enabled by the use of the relatively small GPT-2 model.
Some users report that the tool’s UX can get busy with irrelevant suggestions, and that it uses a lot of memory on top of your IDE’s existing requirements.
Just like Ghostwriter, Tabnine’s suggestions are based on patterns and examples and may not always be correct. ‘Hallucination’ is common, and overreliance on suggestions without careful review of the suggested code can lead to errors and unintended consequences. It’s important for developers to validate all suggested code to ensure its accuracy, security, and compliance with coding standards and project requirements.
3. GitHub Copilot
GitHub Copilot is a code completion tool from GitHub (a Microsoft subsidiary) that is designed to assist developers in writing code by providing context-aware suggestions based on ML algorithms and vast amounts of code from publicly available sources. Copilot was originally based on an LLM called Codex, a derivative of OpenAI’s earlier GPT-3 model. Though Codex was trained specifically on code, its performance was superseded by the general-purpose GPT-3.5 and GPT-4 models (also used for ChatGPT). At the time of writing Copilot is based on GPT-3.5, though an upgrade is expected.
Copilot integrates with popular code editors and IDEs, allowing developers to receive real-time suggestions while writing code, but in future users of Visual Studio Code (also owned by Microsoft) might see more integrated features. It supports multiple programming languages, including Python, JavaScript, Java, C++, and more.
Copilot suffers from the same problem as other LLM-based tools: manual supervision is always required. It provides suggestions based on learned patterns and examples from code which may not be perfect, or sometimes even relevant, so developers must always check outputs for accuracy. Microsoft’s description of Copilot as ‘Your AI Pair Programmer’ is perhaps the clearest signal of how these tools are designed to be used.
Copilot currently sits somewhere between Ghostwriter and Tabnine in terms of data security. It integrates into a developer’s desktop IDE but operates as a cloud-based service, sending code snippets to its servers for analysis and suggestions. A private version of Copilot is expected and users now have greater control of how their inputs are used for AI training, but some security and compliance questions may remain for regulated enterprises.
The use of closed-source models like GPT-4 may also be of concern. OpenAI’s models are now closed-source, meaning there is limited public information as to how they were trained, what data was used, and so on. These factors might have implications for the tool’s output, for example in terms of copyright.
4. ChatGPT
ChatGPT probably doesn’t even need an introduction at this point, but let’s fill the gap for the sake of blog consistency! ChatGPT is a conversational NLP interface for OpenAI’s LLMs that has arguably done more to raise awareness of what AI can do than any other tool to date. It’s currently based on either GPT-3.5 or GPT-4, depending on which subscription tier you opt for (at the time of writing GPT-4 is OpenAI’s latest closed-source AI model). ChatGPT was not conceived with any specific application in mind, but since code and related documentation are largely text-based, software development has proved to be an interesting application for the technology. Alternatives to ChatGPT are already beginning to proliferate, for example Bard from Google.
ChatGPT’s output is generated from natural language ‘prompts’ entered by the developer. Prompts can provide context and be applied iteratively, allowing ChatGPT to be used for tasks beyond suggesting new code, such as summarizing existing code behavior or proposing improvements. Some developers are now using ChatGPT in combination with code completion tools like Copilot, though it seems likely their respective functionality will merge in future.
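For example, a developer might start with a broad prompt and then refine it; the sketch below shows the kind of Java a second, more specific prompt could yield (illustrative only – actual ChatGPT output varies from run to run):

```java
// Prompt 1: "Write a Java method that counts how often each word appears in a list."
// Prompt 2 (refinement): "Make it case-insensitive and ignore blank strings."
// A plausible result after the second, refining prompt:
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCounts {

    public static Map<String, Long> countWords(List<String> words) {
        return words.stream()
                .filter(word -> word != null && !word.isBlank())
                .map(String::toLowerCase)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }
}
```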
For all its potential, ChatGPT has its downsides. Like Copilot, it is based on closed-source LLMs created by OpenAI; all of the same considerations apply around commercial use and security of valuable IP.
Crucially, though natural language prompts underpin the scope of possibilities offered by ChatGPT, they can also be a point of weakness. Output is intimately connected to the inputs used, and small changes to a prompt can have a disproportionate impact, so continuity, repeatability and consistency can be challenging. The conversational interface can also make output seem more convincing than it is: the confidence with which ChatGPT presents its mistakes is already a well-known problem.
5. Diffblue Cover
Diffblue Cover is an AI-powered automated unit test writing tool for Java applications. Rather than LLMs, it uses Reinforcement Learning (RL) to automatically write tests for Java code, to prevent regressions and improve overall code quality.
Cover writes and maintains entire Java unit test suites completely autonomously, eliminating tedious, repetitive, error-prone work from developer workloads and catching dangerous regressions sooner when application code is changed.
Cover operates at any scale, from method-level to across an entire codebase. It is capable of creating tens of thousands of tests completely automatically and can be integrated into an automated Continuous Integration pipeline. Unlike code suggested by the tools we’ve looked at so far, the code written by Cover is guaranteed to compile, run and be correct. That makes it the only effective, practical solution for automatic Java unit test writing and maintenance at an enterprise scale.
It also provides insights into the code coverage achieved by the generated tests, helping to identify areas of the code that may not be thoroughly tested.
As it is specifically designed for Java applications, the tool can’t yet be used on code written in other programming languages. Currently it can only be used for the specific – though incredibly challenging – task of writing and maintaining unit tests at scale without the need for developer involvement.
6. Other New Tools
The world of AI is progressing at breakneck speed – and AI-powered coding tools are keeping pace!
We’ll do our best to keep this article updated with the latest additions – Duet AI from Google and the StarCoder project (supported by Hugging Face and ServiceNow) are just two of the high-profile new options to have appeared recently (though neither is in widespread use yet).
Summary
Generative AI-powered tools are already changing the way developers work by accelerating code writing, removing tedious, error-prone manual effort and providing a new, faster way to expand their knowledge base.
But it’s important to remember that, despite the pace of recent change, AI-powered code generation tools are still in the early stages of development and may not be suitable for every scenario. Development teams should carefully consider the benefits and potential problems of using such tools before integrating them into their development process. Things are changing fast, and there’s unlikely to be a one-size-fits-all solution.
Get in touch to learn more about generative AI for Code and how it can help to increase the productivity of your Java development teams.