As part of my work on the Cohere project, I had to learn about and use Docker. The purpose of this article is to share some of the lessons that I wish I had known when I first started using it. This blog post is split into two sections: the first presents a brief overview of Docker and its uses, and the second provides some Windows-specific advice.
What is Docker?
Docker is an open platform that allows users to package up applications in containers with everything they need to run.
What does that mean exactly?
By using Docker, you can isolate an application from its environment and run it “in any environment, on any infrastructure and be written in any language.” Unlike a VM, a container does not include a full guest operating system with its own kernel. Instead, containerisation uses the host machine’s kernel to run multiple isolated root file systems (aka containers). Each container running on a host machine is isolated by default, so applications running on the same host are unaware of each other; however, Docker networking can be used to allow these containers to communicate.
Why is Docker useful?
- It solves the “but it works on my machine” problem that many teams face:
The developer can isolate the app from its environment, so it no longer depends on what happens to be installed on a particular machine.
- It makes collaborating on the application easier:
Docker containers encapsulate everything the application needs to run, so nothing else has to be shared between team members. Thus, Docker allows applications to be easily moved between environments and run by any host with the Docker runtime/engine installed.
How we used Docker in the Cohere project:
- To emulate a Neo4j environment:
Using Docker, you can easily package a database and its dependencies. Neo4j provides and maintains official Neo4j Docker images on Docker Hub, the official registry for Docker images. There are several ways to use Docker for Neo4j development and deployment. For local testing on the Cohere project, we ran a Neo4j image inside a Docker container. This gave us a service that replicates a live Neo4j environment without requiring an internet connection or a separately running Neo4j server.
- To run our API using a set of environment variables:
As I previously mentioned, containerisation allows you to isolate an application and run it in a specified deployment environment. In our project, this is particularly useful for testing the Cohere API end-to-end. After building the API, the resulting binary can be used to create a container. This container can be given a file that contains all the environment variables required to run the system locally. Testing against this container gives us confidence that the API is ready for deployment. The same container image can then be run with another set of environment variables that allow it to connect to the live database.
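The two uses above can be sketched as `docker run` invocations. The following is a minimal Python sketch that only assembles the commands, it does not start anything. The `neo4j` image name, the `NEO4J_AUTH` variable and ports 7474 (HTTP) and 7687 (Bolt) come from the official Neo4j image; the API image name `cohere-api:local`, the file `local.env`, port 8080 and the password are hypothetical placeholders, not the project's actual values.

```python
import shlex
from typing import List, Mapping, Optional, Sequence


def docker_run_command(
    image: str,
    *,
    env_file: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    ports: Sequence[str] = (),
) -> List[str]:
    """Build the argument list for a `docker run` call without executing it."""
    command = ["docker", "run", "--rm"]
    if env_file is not None:
        # --env-file reads KEY=value lines from a file on the host.
        command += ["--env-file", env_file]
    for key, value in (env or {}).items():
        # --env sets a single environment variable inside the container.
        command += ["--env", f"{key}={value}"]
    for mapping in ports:
        # --publish maps host_port:container_port.
        command += ["--publish", mapping]
    command.append(image)
    return command


# Local Neo4j from the official image. The credentials are placeholders
# intended for local testing only.
neo4j_command = docker_run_command(
    "neo4j",
    env={"NEO4J_AUTH": "neo4j/local-test-password"},
    ports=["7474:7474", "7687:7687"],
)

# The API container, started with a local environment file.
# Image name, file name and port are hypothetical.
api_command = docker_run_command(
    "cohere-api:local",
    env_file="local.env",
    ports=["8080:8080"],
)

print(shlex.join(neo4j_command))
print(shlex.join(api_command))
```

Passing either list to `subprocess.run(command, check=True)` would start the container. Swapping `local.env` for a different file is what lets the same image point at a different database.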
So you have decided to use Docker on Windows. What now?
To begin, know that this journey will not be smooth. At times you will consider turning your host computer into a Linux-based system; however, you are not alone. This section of the blog post aims to help you deal with Docker on Windows and, hopefully, save you time and a lot of grief and frustration.
On Windows, you have two ways to run containers: with the Hyper-V backend or with the WSL2 backend.
This is the first decision you must make, and it is a critical one.
Should I run Docker on Hyper-V or WSL2?
Like many other Windows users, I initially thought that using Hyper-V was the easier choice, as it was available on my Windows device by default. However, WSL2 proved to be the faster and more reliable choice and, in my experience, it avoids a lot of undocumented errors. This is because Microsoft designed WSL2 to use “a real Linux Kernel within a lightweight VM instead of using emulation.” As a result, this approach is more “lightweight and is more tightly integrated with Windows as opposed to using the Docker LinuxKit while enabling Hyper-V.”
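One way to see the "real Linux kernel" in practice is to look at the kernel release string from inside a WSL distribution. The sketch below is a heuristic, not an official check: it assumes Microsoft's default kernel naming, where WSL2 releases end in something like `-microsoft-standard-WSL2` and WSL1 reports an emulated release such as `4.4.0-19041-Microsoft`. A custom-built kernel may not carry these tags.

```python
import platform


def describe_kernel(release: str) -> str:
    """Classify a Linux kernel release string as WSL2, WSL1, or neither."""
    lowered = release.lower()
    if "microsoft" not in lowered:
        return "not WSL"
    # Default WSL2 kernels carry a "WSL2" tag; WSL1 release strings do not.
    return "WSL2" if "wsl2" in lowered else "WSL1"


# Run inside the distribution you want to inspect.
print(describe_kernel(platform.release()))
```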
Things to be careful with when using WSL2 or Hyper-V alongside a VM:
- Why is my VM suddenly not working?
DO NOT PANIC! This is completely normal. Your VM has stopped working because you have now enabled either Hyper-V or WSL2 on your machine, and your VM software will likely not run while these features are enabled.
Here are the instructions for how to fix this problem:
- From the Start menu, click on Settings -> Apps -> Optional Features -> More Windows Features.
- Ensure that the following features are NOT ticked (untick any that are):
  - Virtual Machine Platform
  - Windows Hypervisor Platform
  - Windows Sandbox
  - Windows Subsystem for Linux
- Restart your computer.
Does this mean I cannot use Docker while I am using my VM? Will I have to restart my computer every time I want to switch between them?
Unfortunately, the answer to both these questions is yes.
According to the documentation, Hyper-V can work alongside VMware 15.5.5+ and VirtualBox 6+ as long as you have enabled “Windows Hypervisor Platform”. Sadly, that was not what I experienced: this approach was not very reliable in practice and may result in performance problems.
For this reason, I offer a third solution:
Create a VM, install Ubuntu on it and install Docker inside that VM.
The reason I did not suggest this solution initially is that it requires a good amount of storage and RAM.
Common Docker on Windows errors that you should be aware of:
The error:
setup_1 | standard_init_linux.go:219: exec user process caused: no such file or directory
The problem: Windows-style (CRLF) line endings in a file that the container executes
The solution: Convert the file to UNIX-style (LF) line endings using dos2unix.
Run this command in Git Bash:
dos2unix.exe file-name.file-extension
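The reason this error appears is that a script saved on Windows has a first line such as `#!/bin/sh` followed by a carriage return, so the kernel looks for an interpreter literally named `/bin/sh\r` and reports that no such file exists. If dos2unix is not available, the common case can be handled with a few lines of Python. This is a minimal sketch that only replaces CRLF with LF, it does not cover everything dos2unix does; the file name `entrypoint.sh` is a hypothetical example.

```python
import tempfile
from pathlib import Path


def has_crlf(path: Path) -> bool:
    """Return True if the file contains any Windows-style CRLF line endings."""
    return b"\r\n" in path.read_bytes()


def convert_to_unix(path: Path) -> None:
    """Rewrite the file in place with LF line endings."""
    data = path.read_bytes()
    path.write_bytes(data.replace(b"\r\n", b"\n"))


# Demonstration on a throwaway file so nothing real is modified.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "entrypoint.sh"
    script.write_bytes(b"#!/bin/sh\r\necho ready\r\n")
    if has_crlf(script):
        convert_to_unix(script)
    print(script.read_bytes())
```

Working on bytes rather than text avoids Python's own newline translation on Windows, which would otherwise write the carriage returns straight back.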
The error (in shortened form):
docker: Error response from daemon: status code not OK but 500: ☺����☺♀☻FDocker.Core, Version=18.104.22.168106, Culture=neutral, PublicKeyToken=null♣ocker.Core.DockerException♀ ClassNameMessage♦Data♫InnerExceptionHelpURL►StackTraceString▬RemWatsonBuckets☺☺♥♥☺☺☺☺☺▲System.Collections.IDictionary►System.Excepti☻☻♠ocker.Core.DockerException♠♦▲Filesharing has been cancelled
The problem: Unknown
The solution: If you are running Docker on the Hyper-V backend, switch to WSL2.
To summarise, Docker is a great tool for collaborating in a team, as containers started from the same image behave consistently on any system. By ensuring that all team members use the same Dockerfile, you can be confident that the images they build will be functionally equivalent. Additionally, by isolating the environment of the application, you can execute your code in the same environment as your server and more easily identify problems with your application. You might run into a few issues using Docker on Windows; however, most of these can typically be avoided by using WSL2 and, in the worst case, by uninstalling and reinstalling Docker.