Azure Blob storage is Microsoft’s object storage solution for the cloud. Blob storage is optimized for storing massive amounts of unstructured data. Unstructured data is data that doesn’t adhere to a particular data model or definition, such as text or binary data. Any file, image, or text can be uploaded to Blob storage.
In some scenarios, data needs to be encrypted before it is uploaded to Blob storage. …
Apache already provides the POI utilities for reading and writing Excel files in Java. If fewer than 50K rows need to be inserted into a file, any Excel utility can be used. But when the use case is to insert a million rows into an Excel file, we need to compare the utilities and pick the one that best fits.
Let’s take an example to understand it better. Suppose our Java application generates a report from the documents a user queries from Elasticsearch. The given query matches 1 million documents, so the query should be made in batches to avoid receiving one huge response from Elasticsearch. A batch size is defined (let’s say 5000). …
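The batching idea can be sketched in plain Java. This is a minimal illustration, not the article's implementation: `fetchInBatches`, `fetchPage`, and `fakePage` are hypothetical names, and `fakePage` stands in for whatever Elasticsearch call actually returns one page of documents (how that page is requested, for example with scroll or `search_after`, is outside this sketch).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

public class BatchedFetch {

    // Pulls up to `total` items in pages of at most `batchSize` and hands
    // each page to `handleBatch`. Returns the number of pages fetched.
    // `fetchPage` is a hypothetical stand-in for the real query call:
    // it receives (offset, size) and returns that slice of the results.
    static <T> int fetchInBatches(int total, int batchSize,
                                  BiFunction<Integer, Integer, List<T>> fetchPage,
                                  Consumer<List<T>> handleBatch) {
        int fetched = 0;
        int batches = 0;
        while (fetched < total) {
            int size = Math.min(batchSize, total - fetched);
            List<T> page = fetchPage.apply(fetched, size);
            if (page.isEmpty()) {
                break; // the source ran out earlier than expected
            }
            handleBatch.accept(page); // e.g. append these rows to the report
            fetched += page.size();
            batches++;
        }
        return batches;
    }

    // Fake data source used only for this demo: returns consecutive integers.
    static List<Integer> fakePage(int offset, int size) {
        List<Integer> page = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            page.add(offset + i);
        }
        return page;
    }

    public static void main(String[] args) {
        long[] rows = {0};
        int batches = fetchInBatches(1_000_000, 5000, BatchedFetch::fakePage,
                page -> rows[0] += page.size());
        System.out.println(batches + " batches, " + rows[0] + " rows");
        // prints "200 batches, 1000000 rows"
    }
}
```

Only one page of 5000 items is held in memory at a time, which is the point of batching: the report writer sees every document without the application ever receiving a single million-document response.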
Every Java application requires memory to run on the JVM. This memory is taken from the available RAM of the system where the application is running. There are two kinds of memory: stack and heap.
The stack is the region of RAM used to store temporary (local) variables of primitive data types in Java. It also stores references to the objects that are physically created on the heap.
It stores the variables created by functions in last-in, first-out (LIFO) order and frees all of that memory when the function exits.
Each Java thread has its own stack, so a stack’s scope is limited to its thread. The stack is smaller than the heap. …
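A small example may help separate the two regions. It is a conceptual sketch only: the class and method names are made up for illustration, and the comments describe the usual mental model rather than what a particular JVM does after optimization.

```java
public class StackVsHeap {

    static int sumTo(int n) {
        // `n`, `total`, and `i` are primitives local to this call. They live
        // in the stack frame of sumTo and are gone once the method returns.
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    static int[] makeArray(int size) {
        // The array object is created on the heap. Only the reference
        // `values` is stored in this method's stack frame.
        int[] values = new int[size];
        values[0] = 42;
        return values; // the heap object outlives the frame that created it
    }

    public static void main(String[] args) throws InterruptedException {
        int[] shared = makeArray(3);
        // The worker thread gets its own stack for its call to sumTo, but it
        // writes into the same heap object that the main thread created.
        Thread worker = new Thread(() -> shared[1] = sumTo(10));
        worker.start();
        worker.join();
        System.out.println(shared[0] + " " + shared[1]); // prints "42 55"
    }
}
```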
Kafka is an open-source stream processing platform. It was developed to provide high throughput and low latency for handling real-time data.
Before we read about how to make our Kafka producer/consumer production-ready, let’s first understand the basic terminology of Kafka.
Kafka topic: A topic is a named category that stores a specific kind of message or a particular stream of data. A topic contains messages of a similar kind.
Kafka partition: A topic can be split into one or more partitions. Partitions further segregate the messages of one topic into multiple buckets. …
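To make the "buckets" idea concrete, here is a toy sketch of how a message key could be mapped to a partition. It is an assumption-laden illustration, not Kafka's code: `partitionFor` is a hypothetical helper, and it uses `String.hashCode`, whereas Kafka's own default partitioner hashes keyed messages with murmur2.

```java
public class PartitionSketch {

    // Maps a message key to one of `numPartitions` buckets.
    // Illustrative only: Kafka's default partitioner uses a different hash
    // function (murmur2), so real partition numbers will not match these.
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result in [0, numPartitions) even when
        // hashCode() is negative.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        String[] keys = {"order-1", "order-2", "order-1"};
        for (String key : keys) {
            System.out.println(key + " -> partition " + partitionFor(key, 3));
        }
    }
}
```

The property worth noticing is that the same key always lands in the same bucket, so the two `order-1` messages end up in one partition.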
We live in an era where applications handle huge volumes of data. To keep our applications efficient to search and economical with space, we need to remove aged data from the data store. Removing old data shrinks the search space that a query has to scan to retrieve results, and it reduces the hardware needed to store the documents. Removing individual documents from an Elasticsearch index is quite an expensive operation. Elasticsearch provides a better way to achieve this. …
We learned in this article how Docker helps us deploy an application easily and efficiently. Docker containers are scalable, but scaling them requires manual effort. There are some problems we encounter if we don’t use a container orchestrator.
In a production environment, we really need to think about these problems in order to have a robust, highly available, and economical application. This is where a container orchestrator comes to the rescue. Many orchestrators are available today, of which Kubernetes, originally from Google, is the best known and most widely used. …
Java is a nice language that supports sequential, parallel, and asynchronous programming by letting us create lightweight processes (known as threads) programmatically. This helps us write efficient programs.
Let’s first understand what these three ways of running a program are.
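As a rough preview, the same computation can be written in each of the three styles. This is a minimal sketch with made-up names (`ThreeWays`, `sequentialSum`, `parallelSum`, `asyncSum`), using parallel streams and `CompletableFuture` from the standard library as one possible way to express each style.

```java
import java.util.concurrent.CompletableFuture;
import java.util.stream.IntStream;

public class ThreeWays {

    static int square(int x) {
        return x * x;
    }

    // Sequential: one thread does every step, one after another.
    static int sequentialSum(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += square(i);
        }
        return total;
    }

    // Parallel: the range is split across several threads and the partial
    // sums are combined. The caller still waits for the result.
    static int parallelSum(int n) {
        return IntStream.rangeClosed(1, n).parallel().map(ThreeWays::square).sum();
    }

    // Asynchronous: the work starts on another thread and the caller gets a
    // future back immediately, so it can do other things in the meantime.
    static CompletableFuture<Integer> asyncSum(int n) {
        return CompletableFuture.supplyAsync(() -> sequentialSum(n));
    }

    public static void main(String[] args) {
        CompletableFuture<Integer> pending = asyncSum(100); // returns at once
        System.out.println(sequentialSum(100)); // prints 338350
        System.out.println(parallelSum(100));   // prints 338350
        System.out.println(pending.join());     // waits if needed, prints 338350
    }
}
```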
In order to understand functional programming, we need to understand some basic concepts first.
In my previous post, I tried to explain how to implement pagination in Cassandra. Here, we’ll be looking at how data is written, read, updated, and deleted in Cassandra. Cassandra is a horizontally scalable NoSQL database.
Before we talk about how Cassandra maintains data, we first need to understand some basic terminology:
Before we start talking about Docker, we need to understand the problem that Docker solves efficiently and economically. Before Docker gained popularity, companies used virtualization to run multiple applications, because different applications might need different sets of libraries and operating systems to run.
Why do organizations prefer virtual machines instead of provisioning actual servers to host their applications?