Docker is an open-source software platform that can package software into standardized units called containers. They contain everything needed to run the software, including libraries, system tools, code and runtime. You can not only create these virtualized application containers with Docker, but also deploy and manage them.
Docker makes microservices much easier to manage and deploy. Microservices themselves build on a very old idea in software development: modularization. Dividing software into modules makes it easier to understand, because a developer only needs to understand one module at a time. In conventional approaches, however, modules help only during development: at deployment, all modules of the application are brought together and put into production as a whole.
This is exactly where microservices differ from other modularization approaches. Microservices are modules that can be put into production independently. To achieve this, each microservice runs, for example, as its own process, in its own virtual machine, or in a Docker container as a lightweight alternative to full virtualization.
TOLERANT Software’s data quality products are ideally suited for use as microservices. In particular, address validation (TOLERANT Post), bank data validation (TOLERANT Bank) and name validation (TOLERANT Name) are products that do not need to persist data during operation. They can therefore be deployed particularly simply as Docker containers. TOLERANT Match, the fault-tolerant search and duplicate detection, is also well suited for Docker container technology. A particular challenge here, however, is the persistence of index data.
For example, a typical Dockerfile (simplified) for TOLERANT Match looks like this:
The image (for example tolmatch:8.0) can also be saved as a .tar file:
docker save -o <outputfilename> <imagename>
If you haven’t just created the Docker image yourself, you first have to load it into the local Docker image store from the .tar archive created above:
docker load -i <path to tar archive>
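The save and load steps can be sketched as a small pair of shell helpers. This is only an illustration: the function names, the archive naming scheme and the `DOCKER` override variable are assumptions made for this example and are not part of TOLERANT Match or of Docker itself. Only `docker save -o` and `docker load -i` are real CLI calls.

```shell
#!/bin/sh
# Sketch: move an image between hosts via a .tar archive.
# DOCKER may be overridden (e.g. DOCKER=echo) for a dry run; the default is the real CLI.
DOCKER="${DOCKER:-docker}"

# Derive an archive file name from an image reference,
# e.g. tolmatch:8.0 becomes tolmatch_8.0.tar (hypothetical naming scheme).
archive_name() {
    printf '%s.tar\n' "$(printf '%s' "$1" | tr ':/' '__')"
}

# On the build host: write the image to a .tar archive.
save_image() {
    $DOCKER save -o "$(archive_name "$1")" "$1"
}

# On the target host: load the archive into the local image store.
load_image() {
    $DOCKER load -i "$1"
}
```

On the build host you would call `save_image tolmatch:8.0`, copy the resulting archive to the target host, and call `load_image tolmatch_8.0.tar` there. Afterwards, `docker image ls` should list the image.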
# Start with a mounted volume (-v) and a mapped port (-p). (example)
docker run --name tolmatch --user <yourPreferredUser> -p <hostPort>:<containerPort> -v <hostDirectory>:/opt/tmatch8/data <imagename>
As you can see from the startup parameters, you still need to think a bit about the port mapping between the container and the host system. The “-p” parameter of the call is used for this purpose.
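The “-p” parameter takes the host port first and the container port second. A minimal sketch of a helper that builds and checks such a mapping follows. The function name `port_mapping` and the port numbers in the usage note are assumptions for illustration; the port on which TOLERANT Match actually listens depends on its configuration.

```shell
#!/bin/sh
# Print a "-p" argument that publishes a container port on a host port.
# $1: host port, $2: container port (both illustrative values).
port_mapping() {
    host_port=$1
    container_port=$2
    for port in "$host_port" "$container_port"; do
        # Reject empty or non-numeric values.
        case $port in
            '' | *[!0-9]*)
                echo "invalid port: $port" >&2
                return 1
                ;;
        esac
        # Reject values outside the valid TCP port range.
        if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
            echo "port out of range: $port" >&2
            return 1
        fi
    done
    printf '%s\n' "-p ${host_port}:${container_port}"
}
```

For example, `docker run $(port_mapping 8080 80) <imagename>` would make a service that listens on port 80 inside the container reachable on port 8080 of the host.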
TOLERANT Match itself persists data in several places:
- Application data (directory /opt/tmatch8/data)
- Configuration data (directory /opt/tmatch8/config)
- Log information during operation (directory /opt/tmatch8/logs)
If you want to prevent the data from being lost after a container restart, you must store the directories on a volume (or a host system directory) outside the container. To do this, use the “-v” parameter.
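As a sketch, all three directories listed above can be mapped to host directories in one call. The host base directory, the function name `run_with_volumes` and the `DOCKER` override variable are assumptions for this example. The container paths are the ones named in this post.

```shell
#!/bin/sh
# Sketch: start TOLERANT Match with its persistent directories kept on the host.
# DOCKER may be overridden (e.g. DOCKER=echo) for a dry run; the default is the real CLI.
DOCKER="${DOCKER:-docker}"

# $1: host base directory (hypothetical, e.g. /srv/tolmatch)
# $2: image reference (e.g. tolmatch:8.0)
run_with_volumes() {
    base=$1
    image=$2
    # One -v per directory: <host path>:<container path>
    $DOCKER run --name tolmatch \
        -v "$base/data:/opt/tmatch8/data" \
        -v "$base/config:/opt/tmatch8/config" \
        -v "$base/logs:/opt/tmatch8/logs" \
        "$image"
}
```

A call such as `run_with_volumes /srv/tolmatch tolmatch:8.0` keeps application data, configuration and logs on the host, so they survive when the container is removed and recreated. A named volume can be used instead of a host path in the same position, for example `-v tolmatch-data:/opt/tmatch8/data`. If the container runs with “--user”, that user needs write permission on the mounted directories.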