The creative power of the human mind has often been recognized as the greatest strength in art. The ability to internalize real-world circumstances and convey thought through visual form, storytelling, or music is a facet of human society that dates back to the beginning of recorded history. The primacy of the human mind in art long went undisputed, but modern technology now challenges the claim that human sensibility is necessary to produce creative works. Artificial intelligence, or AI, is a broad field that includes machine learning, in which computer programs are exposed to data and then learn to complete tasks independently. A recently announced program has demonstrated capabilities beyond those of its contemporaries and has unlocked the previously unforeseen power of AI-generated art.
The new program, known as DALL-E, suggests that the sky is the limit for creative artificial intelligence. DALL-E was developed in 2021 by OpenAI, an artificial intelligence lab that has spent the past seven years building programs that approach human capability in various fields. The platform takes its name from two starkly different influences: Spanish painter Salvador Dali and the lovable robotic protagonist of Pixar’s “WALL-E.” It has attracted a devoted following online for its groundbreaking ability to understand complex sentences and produce original computer-generated visuals from written descriptions.
The platform’s user interface is reminiscent of many search engines, with a text bar in which users enter phrases that serve as instructions for generating original images. Within 30 seconds of the user pressing Enter, half a dozen rendered images appear on the screen. The content varies slightly from image to image, with some offering a literal interpretation of the search phrase while others explore the implied meanings of its words. This ability to interpret a string of words in multiple ways demonstrates an inventive level of textual understanding that feels remarkably human for an AI. The platform’s website advertises many of its most impressive abilities, such as “creating anthropomorphized versions of animals and objects, combining unrelated concepts in plausible ways, rendering text, and applying transformations to existing images.” These descriptions only scratch the surface of what DALL-E is capable of, but OpenAI has already moved beyond that first program in a quest to code something even closer to sentient life.
DALL-E was soon followed by DALL-E 2, a similar application that performs almost the same function but produces sharper images and has a more advanced understanding of English syntax. Neither app is available for public use; the latter is in beta testing and has been made available to select online personalities to advertise its features. It’s unclear when or if the platforms will be released for general use, although it seems likely that they would sit behind a paywall if a public version were developed. The lack of general knowledge about the program’s full functionality or its technical basis has left many speculating about the code that powers the two applications, although the OpenAI website provides a wealth of information about certain components of their inner workings.
Since its inception in the 1940s, digital computer technology has been able to interpret human input and produce a desired response, usually in text form. When a search engine such as Google Images is asked to display an image, it does so by retrieving an existing file that it understands, via machine learning processes, to be related to the search terms. DALL-E is built on the Generative Pre-trained Transformer 3 (GPT-3) framework, a language model that learns to predict and generate sequences of text. The platform takes this model and expands on it: rather than retrieving stored files as a search engine does, it is trained on a large collection of images paired with text descriptions. It leverages the GPT-3 approach to recognize word order and meaning and to associate words with the visual content they describe. Once it has interpreted the input phrase using these learned associations, it can generate an original image by combining the disparate content named in the search phrase.
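The idea of a program that "learns to predict and generate sequences of text" can be made concrete with a toy sketch. The example below is not GPT-3, which is a very large neural network, and it is not how DALL-E produces images; it is a minimal word-pair counter, with an invented three-sentence corpus and hypothetical function names, that shows only the basic principle of learning from examples which word tends to come next.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the training text."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def generate(model, start_word, max_words=5):
    """Extend start_word by repeatedly picking the most frequent follower."""
    output = [start_word]
    for _ in range(max_words - 1):
        followers = model.get(output[-1])
        if not followers:
            break  # the model never saw anything after this word
        output.append(followers.most_common(1)[0][0])
    return " ".join(output)


# Invented training sentences, for illustration only.
corpus = [
    "a robot paints a melting clock",
    "a robot paints a sunset",
    "a robot draws a melting clock",
]
model = train_bigram_model(corpus)
print(generate(model, "robot", max_words=3))  # robot paints a
print(generate(model, "melting"))             # melting clock
```

A real language model replaces these simple counts with billions of learned parameters and looks at far more than one preceding word, but the underlying task of predicting what comes next is the same.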
There are countless reasons to praise the minds behind DALL-E for concocting a creative tool with such a sophisticated understanding of language and visual art, although there are also reasons to be concerned. The art world immediately began to worry about a market in which artificial intelligence could drive living artists out of their jobs. The frantic talk around DALL-E makes sense for those concerned about their careers, even though this is not the first time visual artists have been threatened by the march of technology and ultimately survived it. Photography was also once a dreaded new medium, the ease of capturing actual images seeming to challenge the job security of portrait painters and impressionist painters. Although the medium may have reduced the demand for painted artwork, classic forms of visual art survived the age of cameras because photography became a separate sector of the art world and was often used by painters to inspire their work. OpenAI’s stated goal in developing the DALL-E programs is to help graphic designers by giving them a tool that quickly generates reference images they can use in many ways to make more art. The ability to generate reference images quickly, and in a style the artist may not have envisioned, is an incredible asset to those who learn to use it and will likely bring artists more than was intended.
The impressive technology at play within DALL-E presents another ethical dilemma. The significant difference between a sentient artist and a robotic curator is the presence of a moral compass within the former. DALL-E can render photorealistic visuals and could hypothetically be asked to depict harmful content without much user involvement. In anticipation of such circumstances, the AI refuses to generate images from violent or explicit search terms and will also avoid producing visuals containing public figures. These rules preempt some forms of abuse of the technology, though astute users could search for specific, uncensored terms to generate images that approximate what the program would refuse to depict when asked in censored terminology. It’s easy to blame DALL-E for this flaw, although the user remains the driving force behind any improper work done by the application. Human artists have also shown a tendency to produce despicable art without the marvels of 21st-century technology, as demonstrated by many propaganda artists of centuries past. Any method of communication can be channeled toward dubious purposes, but it is unreasonable to blame the tool for a problem that is the direct responsibility of its user.
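The weakness of term-based refusal described above can be illustrated with a short sketch. This is an assumption-laden toy, not OpenAI’s actual filter, whose design is not described here: the blocklist, the function name, and the example prompts are all hypothetical. It shows only why a filter that matches specific words will pass a reworded request that implies the same picture.

```python
import re

# Hypothetical blocklist, for illustration only; not the real filter's terms.
BLOCKED_TERMS = {"violence", "blood", "weapon"}


def is_prompt_allowed(prompt):
    """Return True if none of the prompt's words appear in the blocklist."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return words.isdisjoint(BLOCKED_TERMS)


print(is_prompt_allowed("a knight covered in blood"))      # False
print(is_prompt_allowed("a knight covered in red paint"))  # True
```

The second prompt passes because the filter checks vocabulary rather than meaning, which is why responsibility for what is ultimately requested still rests with the person typing.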
Although the name of the platform refers to Dali, it is worth examining the difference between the program and the painter to allay the concerns of those who find DALL-E and its successor dangerous. Salvador Dali was an eccentric Surrealist painter and one of the most influential figures in 20th-century art. His highly stylized work is instantly recognizable and the product of his own ingenuity; his brush gave rise to contours and compositions that no one had imagined before. DALL-E, on the other hand, can only imitate, and its ability to create new styles or shapes beyond the visuals it has learned from is limited. The program cannot follow in Dali’s footsteps and take the next quantum leap in artistic thought in the way budding artists today undoubtedly will. Whether it is used to create, imitate, or outright copy a style or form, it always takes a creative mind to get behind the wheel and steer it in a certain direction. DALL-E need not sound the alarm for a war on technology; rather, it reminds us that even as artificial intelligence advances, we can recognize it as an extension of ourselves.