Generative AI: State of Play, Part 1

In the last few years, AI startups have been testing new business models in search of a niche for generative content. They did this by actively engaging users and, more importantly, by creating APIs for their platforms. Many people have heard of, and even used, products such as Phind, ChatGPT, and Midjourney.

Recently, I’ve been working with related Generative AI products and tools, and researching how Generative AI is used by colleagues. I began to wonder about the following questions:

Will content creation professions disappear?
How exactly will AI tools affect certain industries?
What are the advantages of humans compared to Generative AI applications?

Let’s deal with these questions together.

Current applications and programs

Currently, there are applications and programs in the following areas:

Image Generation.
Text Generation.
Code Generation.
Video Prediction/Generation.
3D Shape Generation.
Text-to-Speech Generation.
Speech-to-Speech Conversion.
Music Generation.
Semantic Image-to-Photo Translation.
Image-to-Image Conversion.
Image Resolution Increase.

At a minimum, Generative AI will seriously affect industries where open source is used. This includes scenarios that require data generation, summarization, and contextual clarification. In imaging, it will influence work that relies on particular formats of images, videos, and 3D graphics.

Though Generative AI is quite an impressive technology, it relies on machine learning, language models, and graphical models built on data annotated and labeled by humans. Humans are still vital in generating ideas. The idea for this article was my own, but I was inspired to write the text by using Generative AI. These models use human-generated content, and now we can use AI-generated content for inspiration and new ideas.

Midjourney. Description/Prompt: Ukrainian Carpathians Montane meadow photograph, photorealistic 8K, HD …

Despite advances in AI/ML, all learning is human-directed and human-assisted. Most of the data the models are trained on is publicly available. There is also a wealth of private data that is available to humans but is mostly not used to train models, for example, internal corporate knowledge systems, closed-source databases, and libraries.

How not to get lost in generated content?

The availability of ChatGPT has caused active, and even heated, discussions about the expediency and ethics of using the technology in education: when passing professional certifications, when answering exams, and so on. Stack Overflow has updated its usage policies and banned the use of ChatGPT. The New York City Department of Education has blocked ChatGPT on school devices and networks.

Such assistants and tools will occupy their niche and significantly speed up work with data. However, the results will still be checked by people with relevant experience to validate and apply the answers.

Validate output of Generative AI

The ultimate truth about text-generative AI is that we still need people on the other side of the screen. Suppose you aren’t a subject matter expert in a specific area. At first glance, the generated text may seem correct. But there are many examples where generated content contains all the needed data, acronyms, and terms, yet is nonsense or contains critical mistakes.

So we need someone who can validate Generative AI output.

Many startups and applications are emerging at the intersection of different areas of content generation. Amusingly, there are even platforms that generate websites and materials for the sole purpose of launching startups.

There are many cases where generated images are published and presented as images of real events or people. Such cases create a demand for recognition tools to validate the images. Perhaps image generation companies and projects will be able to add certain pixels to mark an image as generated. So far, there are initiatives from artists, such as NO AI, who label their images with the aim of banning them from being used to train AI models. Some artists are also suing for copyright infringement.

Will tools for recognizing generated texts and images appear? And how quickly? Some already exist, such as the Deepfake Detection Challenge Dataset and the AI text classifier for detecting generated text. For faces, there is Fawkes, a tool that protects your privacy when you post pictures online.

What is the reason for such growth of AI startups and companies in the last year?

I assume that this is a cumulative effect of the following:

an increase in the number of data scientists and, as a result, in the number and quality of scientific publications and citations
financial investments in this direction
the wide availability of processing power and a decrease in its cost

In previous years, significant resources and investment were directed to AI companies. Universities that have traditionally researched AI/ML have developed this direction further in the last 5-10 years. The number of relevant departments, students, and scientific staff has steadily increased. Commercial companies have been able to cooperate with relevant universities and create their own projects and R&D.

Over the past five years, the organizers of conferences, workshops, and seminars have begun to attract more relevant speakers. Currently, most conferences and IT events or exhibitions have separate sections or zones for AI/ML.

What are the current limitations of Generative AI?

The first is the self-imposed limitations of each available platform, which are specified in its Terms of Use. Many models apply filters to the input text that describes what to generate. For example, restrictions apply to content that incites hatred, the creation of fakes, and materials containing explicit content. In addition, the output size is limited for image generation. For example, the available size options may be 256×256, 512×512, and 1024×1024. That is, if you want to create an image of a non-standard size, for now, you will have to rely on the work of humans.
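To make the size constraint concrete, here is a minimal Python sketch that maps a requested image size to one of the square sizes listed above and computes a centered crop. The function names are hypothetical and chosen only for illustration, and the list of sizes is taken from this article rather than from any specific platform's documentation. Cropping changes the composition and does nothing for requests larger than the biggest supported size, so it does not remove the need for human work.

```python
# Square output sizes mentioned in the article (assumption: real platforms may differ).
SUPPORTED_SIZES = (256, 512, 1024)


def pick_generation_size(width, height, supported=SUPPORTED_SIZES):
    """Return the smallest supported square side that covers width x height.

    Returns None when the request exceeds every supported size.
    """
    needed = max(width, height)
    for side in sorted(supported):
        if side >= needed:
            return side
    return None


def centered_crop_box(side, width, height):
    """Return (left, top, right, bottom) for a centered width x height crop
    taken from a side x side image."""
    left = (side - width) // 2
    top = (side - height) // 2
    return (left, top, left + width, top + height)


# Hypothetical request: an 800x450 banner.
side = pick_generation_size(800, 450)
if side is None:
    print("Requested size exceeds all supported sizes")
else:
    print(side, centered_crop_box(side, 800, 450))
```

The crop box uses the (left, top, right, bottom) convention common in image libraries, so the result could be handed to an image editor or library of your choice.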

There are problems with displaying relevant text on generated images:

Prompt for OpenAI: RESTfull API security

Directly specifying the text that should appear on the billboard also gives an undesirable result.

Billboard with text ‘Hi there’

Tools may also impose restrictions because the tool itself cannot yet fully enforce the limitations set forth in its terms and conditions. For example, the source code of Imagen Video has not yet been published for reasons related to sensitive content filtering. In addition, some platforms and organizations impose their own limitations on the use of Generative AI results. Again, this is mainly due to the inability to control the quality of the content.

So, where there are technical limitations and self-imposed limitations, human labor will still be involved in creating relevant content.

Ready to keep going? Read Part 2 of this series here.

Learn more about Generative AI
Try this “Explore Generative AI” learning lab

Authors

Oleksii Borysenko

Developer Advocate

DevNet
