485 reads

Accelerating Excavation and Refinement of Data Gold Mines

by Jorge Torres, April 2nd, 2023

Too Long; Didn't Read

Data-driven decision-making is going to become a fundamental part of day-to-day operations. Generative AI and natural language processing (NLP) can be used to create predictive models based on large datasets. The real trick is integrating AI and NLP models seamlessly into databases.

There are many approaches to business in 2023, but one thing all leaders agree on is that the future of digital transformation lies in “the great dig” – unearthing the goldmine of data lying just beneath the surface. The trouble is, there is so much data flowing through businesses that it can be hard to separate the wheat from the chaff.


How does a business know when it has tapped the right vein of data ore, and more importantly, how can it gather and mobilize that data into actionable business intelligence?


How Can Businesses Convert Valuable Data Into Actionable Business Intelligence?

Increasingly, the obvious answer is a combination of artificial intelligence (AI) and machine learning (ML).  In other words, data science. AI models can be deployed to help businesses make predictions, take advantage of market opportunities, measure performance and innovation, and react optimally to external events beyond their control.


In a recent report, McKinsey posited that by 2025, AI-driven workflows and seamless interfacing between humans and machines will become standard. Put simply, data-driven decision-making is going to become a fundamental part of day-to-day operations, from menial tasks to broad-sweeping business decisions.


That’s all well and good, but the sheer volume of data, combined with the intricacies of devising and deploying AI models, makes this a little too good to be true. The amount of data businesses have access to has reached saturation point, and not all of that data is good or useful. When a business does “strike it rich” by tapping a rich supply of data, employing AI to turn that data into business insight can take weeks if not months, by which time any opportunity to put that data to good use has passed.


The current process for setting up AI models to interpret data is long and complex, requiring input from data scientists, developers, analysts, and users. According to Algorithmia’s 2021 report, once a use case for an AI model has been identified, it takes two-thirds of organizations more than a month to develop a usable model, and another whole month to deploy it. The time and resources invested are wasted if the opportunities to apply learnings from that model have passed. What results is a revolving door of high-investment development and deployment with minimal gain.


Resources are key here. To implement AI modeling effectively, an organization needs advanced in-house knowledge of coding and machine learning, or at least a hefty budget to employ a third party. Even then, the data efficiency bottleneck remains. Once an expensive team of experts has been assembled, they’ll need time to develop a usable model, and once that model is outdated, more time again. So how do businesses get ahead of the curve and speed up the excavation and refinement of their data?


The Answer Lies in Generative AI and Natural Language Processing (NLP)

Generative AI can be used to create predictive models based on large datasets. For example, an NLP model could be trained on customer reviews to predict which products are likely to receive positive or negative feedback. Generative AI can also be used to train predictive models in real time, which is particularly useful when there is limited historical data available, or when new and unexpected events occur that existing data doesn’t account for.

Consider a business that sells products online: it might use generative AI to create new variations of product images that can be used to train models to recognize and classify products more accurately. A business that analyzes customer behavior might use generative AI to create synthetic customer profiles that can be used to train a machine learning model to predict customer preferences and behaviors more accurately.

Generative AI can also be used to simulate different scenarios and outcomes, allowing businesses to explore the potential impact of different decisions before they’re made. This can help businesses make more informed decisions, as well as identify potential risks and opportunities.
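As an illustration of the first example above, predicting feedback polarity from review text, here is a minimal sketch using scikit-learn. The handful of reviews and labels below are invented purely for demonstration; a production system would train on a far larger corpus, and likely on a pretrained language model rather than a simple bag-of-words baseline.

```python
# Minimal sketch: predicting positive/negative feedback from review text.
# The tiny dataset here is invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: review text, with 1 = positive feedback, 0 = negative
reviews = [
    "Great quality, arrived quickly, very happy",
    "Terrible build, broke after two days",
    "Excellent value and works perfectly",
    "Disappointing, would not recommend",
    "Love it, exactly as described",
    "Awful experience, item was faulty",
]
labels = [1, 0, 1, 0, 1, 0]

# TF-IDF features feeding a logistic regression: a simple, fast baseline
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(reviews, labels)

# Score new, unseen reviews to flag likely negative feedback early
new_reviews = [
    "Works perfectly, great value",
    "Broke after one day, awful",
]
predictions = model.predict(new_reviews)
```

The same pattern scales: swap the toy list for a query against the reviews table, and the predictions become a signal a business can act on before negative feedback accumulates.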


The real trick is being able to apply generative AI and NLP model training seamlessly within databases. This is the hump many businesses fail to get over; those that do clear it often find the process expensive, resource-intensive, and time-consuming, to say nothing of the sheer level of expertise required. This technology needs to be democratized to the point where virtually any user, with a little training, can use it. There are solutions out there that do just that, allowing users to bring state-of-the-art natural language processing and generative AI models into their databases with just a few lines of basic SQL.
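As a sketch of what such an in-database workflow can look like, here is hypothetical SQL in the style of in-database ML platforms. The `CREATE MODEL` / `PREDICT` syntax, the table names, and the column names are illustrative assumptions, not any specific product’s API:

```sql
-- Hypothetical in-database ML syntax (illustrative only):
-- train a model directly from a table of historical reviews
CREATE MODEL review_sentiment
FROM (SELECT review_text, sentiment FROM product_reviews)
PREDICT sentiment;

-- then query the model like an ordinary table for real-time predictions
SELECT review_text, sentiment
FROM review_sentiment
WHERE review_text = 'Arrived late and the packaging was damaged';
```

The point is the shape of the workflow: training and prediction happen where the data already lives, expressed in the query language analysts already know, rather than in a separate ML pipeline.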


What results is a form of real-time, in-database machine learning. Instead of AI models existing on the periphery, separate from a business database, the model becomes one with the database and provides real-time insights into datasets that can be actioned in a matter of hours or days instead of weeks or months. If the hunt for high-value data is “the great dig”, then the democratization of generative AI and NLP is the shovel that will help businesses unearth it.



Lead image generated with Stable Diffusion.

Prompt: Illustrate a gold mine.