Build an AI startup in 5 minutes with Chat-GPT as your coach! You won’t believe the results!

franck nouyrigat
10 min read · Dec 22, 2022


A DALL-E image of an AI coach…

Case study: use Chat-GPT to explain to me what GPT is, then start building a startup on it, in less than 5 minutes! 🤯

My background:

I have been playing with GPT-3 from OpenAI for a while now as an early adopter. It was fun at first to impress a few friends, but learning how to use it for work is a lot more rewarding. I hope this article will open the door to many possible uses, especially for founders out there.

Let’s start by briefly sharing what I know about how this black magic happens. GPT stands for Generative Pre-trained Transformer (the 3 in GPT-3 simply marks the third generation).

Since I had played a while back with another type of machine-learning model called an RNN (recurrent neural network), as in this video (https://www.youtube.com/watch?v=Jkkjy7dVdaY), I thought that would be an excellent place to start learning how GPT really works.

If all of this still looks like black magic to you, I would recommend watching this visualization of basic neural network mechanisms: https://www.youtube.com/watch?v=aircAruvnKk and, if you are committed, you can read https://www.deeplearningbook.org/ for free (a book co-written by Ian Goodfellow, who spent time at OpenAI).

This will take some of the black magic away. But at the end of the day, a multi-billion-parameter tensor talking back to me still feels like magic, especially knowing that all it takes is a bunch of layers, weights, and fairly simple functions…
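To make that concrete, here is all a single layer of such a network amounts to: a weighted sum pushed through a simple function. A toy sketch of my own in plain NumPy, with made-up sizes:

import numpy as np

# One neural-network layer: multiply by weights, add a bias,
# apply a simple nonlinearity. Stack enough of these (plus the
# attention trick discussed below) and you get GPT-scale "black magic".
x = np.random.randn(4)        # input vector (4 features, arbitrary)
W = np.random.randn(3, 4)     # weight matrix: 4 inputs -> 3 outputs
b = np.random.randn(3)        # bias vector

y = np.maximum(0, W @ x + b)  # ReLU(Wx + b), the whole "simple function"
print(y)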

A word of caution: Chat-GPT is a master bullshitter.

Back to asking Chat-GPT to help me understand what GPT is… A quick note: the novelty of Chat-GPT over GPT is its “memory” of past conversations. A danger of GPT is that it can go complete BS on you, to the point of inventing fake quotes, fake books, and phony authors. So the first thing to keep in mind is to always double-check what it says (in a way, it’s like a brilliant friend who hates being wrong to the point of bullshitting us!)

Teach me what GPT is based on my background:

Let’s go to https://chat.openai.com and log in.

Dear Chat-GPT, explain to me: what is Chat-GPT?

My input:

“Hey, I built a text generator that I trained on an RNN. I get the basic math and ideas behind neural networks, but I’d like to know in detail what makes GPT-3 different from an RNN and what it can do better. Oh, and I will share your answer with a public not familiar with neural networks and tensor math, so keep it simple”

Its answer:

GPT-3 (Generative Pre-training Transformer 3) is a large and powerful language processing neural network developed by OpenAI. It is a type of transformer-based model, which is a type of neural network architecture that was introduced in the paper “Attention is All You Need” by Vaswani et al. in 2017.

One key difference between GPT-3 and a traditional recurrent neural network (RNN) is the way they process input data. RNNs process input sequentially, meaning they process one element of the input at a time and use the output from the previous element to inform the processing of the current element. This makes RNNs well-suited for tasks that involve sequential data, such as language translation or language modeling.

On the other hand, GPT-3 uses a transformer-based architecture, which processes input in parallel using self-attention mechanisms. This allows GPT-3 to process input more efficiently and effectively, particularly for tasks that involve long-range dependencies, such as language generation and language understanding.

GPT-3 has several capabilities that make it a powerful tool for natural language processing tasks. It can perform tasks such as language translation, summarization, question answering, and language generation with high accuracy and efficiency. It can also learn to perform new tasks without explicit supervision, making it a versatile tool for a wide range of applications.

Overall, GPT-3 is a powerful and advanced language processing neural network that has the ability to perform a wide range of natural language processing tasks with high accuracy and efficiency. It is particularly well-suited for tasks that involve long-range dependencies and can learn to perform new tasks without explicit supervision.
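To make that sequential-vs-parallel contrast concrete, here is a rough sketch of my own (not Chat-GPT’s, and with arbitrary dimensions): the RNN has to walk through the sequence one step at a time, while the transformer scores every pair of tokens in a single matrix product.

import numpy as np

np.random.seed(0)
seq = np.random.randn(5, 8)            # a toy "sentence": 5 tokens, 8-dim embeddings

# RNN-style: strictly sequential; step t cannot start before step t-1
Wh, Wx = np.random.randn(8, 8), np.random.randn(8, 8)
h = np.zeros(8)
for token in seq:
    h = np.tanh(Wh @ h + Wx @ token)   # the hidden state h carries the "memory"

# Transformer-style: all pairwise token interactions in one shot
scores = seq @ seq.T                   # 5x5 matrix: every token scored against every token
print(scores.shape)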

I want to be sure I got it, so I ask how it works and how I can learn by doing.

First, to be safe, I searched for the paper above to make sure it’s real (and to read it later):

Hooray, this is real: https://arxiv.org/pdf/1706.03762.pdf (great paper, by the way!)

Let’s check that I understood what GPT is, based on its answer.

“Would it make sense to say that GPT’s key innovation is the use of attention mechanisms? What is the math behind it, and how can I play with it using Python on my Mac?”

Chat-GPT

Yes, the use of attention mechanisms is one of the key innovations of GPT-3 and other transformer-based models. Attention mechanisms allow the model to selectively focus on different parts of the input when processing it, rather than processing the entire input sequentially. This allows the model to effectively handle tasks that involve long-range dependencies, such as language generation and language understanding.

The math behind attention mechanisms is based on the concept of dot products, which are a type of scalar product (also known as an inner product) of two vectors. In the context of attention mechanisms, the dot product is used to calculate the similarity between a query vector and a key vector. The query and key vectors are typically the outputs of separate linear transformations of the input data, and the dot product is used to calculate a scalar weight for each element of the input. These weights are then used to compute a weighted sum of the input elements, which is used as the output of the attention mechanism.

You can use Python to implement attention mechanisms and experiment with them on your own. There are several popular libraries and frameworks that you can use for this purpose, including PyTorch, TensorFlow, and Keras. These libraries provide a range of tools and APIs that you can use to build and train neural network models, including models that use attention mechanisms.

If you are new to neural networks and want to learn more about how to implement and work with them in Python, there are many resources available online that can help you get started. For example, you can find tutorials and examples on the websites of the libraries and frameworks mentioned above, or you can search for online courses and tutorials that cover these topics in more detail.
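Before taking its word for it, I wanted to see the dot-product-as-similarity idea with actual numbers. A minimal sketch of my own, with hand-picked vectors:

import numpy as np

query = np.array([1.0, 0.0, 1.0])
key_similar = np.array([0.9, 0.1, 0.8])    # points roughly the same way as the query
key_unrelated = np.array([0.0, 1.0, 0.0])  # points somewhere else entirely

# The dot product is large when vectors align and small when they don't:
print(query @ key_similar)     # 1.7 -> would get a high attention weight
print(query @ key_unrelated)   # 0.0 -> would get a low attention weight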

Let’s challenge our AI teacher:

I don’t see the link between a dot product and attention; how does that work? Give me an example of two vectors and how attention emerges from the use of a dot product in a GPT model on language…

And within seconds, it broke down how a dot product can bring “attention” to a neural network 😮

In the context of attention mechanisms, the dot product is used to calculate the similarity between a query vector and a key vector. The query and key vectors are typically the outputs of separate linear transformations of the input data, and the dot product is used to calculate a scalar weight for each element of the input.

For example, consider a transformer-based language model that is processing a sentence. The input to the model is a sequence of word embeddings, and the goal is to predict the next word in the sequence. To do this, the model first applies linear transformations to the input word embeddings to create query and key vectors. The dot product of the query vector and each key vector is then calculated, and the resulting scalar values are used as weights to compute a weighted sum of the input word embeddings. This weighted sum is then used as the output of the attention mechanism, and it is passed through additional layers of the model to produce a prediction for the next word in the sequence.

In this way, the attention mechanism allows the model to selectively focus on different parts of the input when processing it, rather than processing the entire input sequentially. This can be particularly useful for tasks that involve long-range dependencies, such as language generation and language understanding, as it allows the model to effectively capture relationships between words that are far apart in the input sequence.

I hope this helps to clarify how the dot product is used in the context of attention mechanisms in transformer-based models like GPT-3. Let me know if you have any other questions!
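Here is that whole recipe as runnable code: a toy single-head attention in PyTorch. This is my own sketch of what Chat-GPT describes, with arbitrary dimensions, and I’ve added the third “value” transformation that the real mechanism uses but its simplified description glosses over:

import torch
import torch.nn.functional as F

torch.manual_seed(0)
d = 16
embeddings = torch.randn(4, d)        # a 4-word "sentence" of 16-dim embeddings

# Separate linear transformations produce queries, keys, and values
Wq, Wk, Wv = torch.randn(d, d), torch.randn(d, d), torch.randn(d, d)
Q, K, V = embeddings @ Wq, embeddings @ Wk, embeddings @ Wv

# Dot product of every query with every key -> similarity scores,
# scaled by sqrt(d) as in "Attention Is All You Need"
scores = Q @ K.T / d ** 0.5           # shape (4, 4)
weights = F.softmax(scores, dim=-1)   # each row sums to 1

# Weighted sum of the values: each word's output "attends" to every word
output = weights @ V
print(weights)                        # who attends to whom
print(output.shape)                   # torch.Size([4, 16])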

Now that I’ve learned more than enough, let’s run some code!

Help me write a Python script that can run a transformer offline on my Mac, to show off to my friends that I understand how GPT works :)

This is great!

After manually installing PyTorch and Transformers (pip install torch transformers), I ran the script it generated for me:

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Download a pre-trained transformer model
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Use a tokenizer to encode the input and decode the model's output
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

# Move the model to the GPU if one is available (the original script only
# moved the inputs there, which would crash on a CUDA machine)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)


def generate_text(prompt):
    # Encode the prompt and convert it to a tensor
    input_ids = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0).to(device)

    # Create an attention mask
    attention_mask = input_ids.ne(0).type(torch.long)

    # Generate text (do_sample=True is required for top_k/top_p/temperature
    # to have any effect; without it, generation is greedy)
    output = model.generate(input_ids, attention_mask=attention_mask,
                            max_length=200, do_sample=True, top_k=10,
                            top_p=0.9, temperature=0.7,
                            pad_token_id=tokenizer.eos_token_id)

    # Decode the generated text
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

    # Remove the initial prompt and add a period if the generated text does not end with one
    generated_text = generated_text[len(prompt):]
    if not generated_text.endswith("."):
        generated_text += "."

    # Split the generated text into sentences and return the first sentence
    sentences = generated_text.split(".")
    return sentences[0] + ".\n"


# Start the conversation
print("Welcome to the transformer chatbot! Type 'exit' to end the conversation.")
while True:
    # Get input from the user
    user_input = input("\nEnter your message: ")
    if user_input.lower() == 'exit':
        break

    # Generate a response
    response = generate_text(user_input)
    print(response)

Now let’s ask for a startup idea using my new AI/GPT dev skills!

I first asked for general ideas; it gave me two: a text-completion service for writers and journalists, or a custom chatbot.

Nothing very original, but it inspired me to build a service that generates great clickbait titles.

The birth of TitleGenie, the clickbait generator :)

Sure! Here is a startup idea for a text completion service for journalists to generate clickbait titles:

The service could be called “TitleGenie” and it could be a web-based tool that uses a fine-tuned transformer model to generate catchy, attention-grabbing titles for articles and blog posts. The service could be aimed at journalists and content creators who want to generate more traffic and engagement for their online articles.

To fine-tune the transformer model for this purpose, you could use a large dataset of clickbait titles and article subjects as training data. This dataset could include headlines from news articles, blog posts, and other online content that is designed to grab the reader’s attention and entice them to click on the article. You could use this dataset to fine-tune the model to predict the next word or phrase in a given title prompt, and to generate titles that are both catchy and relevant to the article subject.

Once the model is fine-tuned, you could publish it as a web service that users can access through a subscription-based interface. Users could input a subject or keyword for their article, and the service would generate a list of potential titles for them to choose from. You could also offer additional features such as the ability to customize the tone and style of the generated titles, or to generate titles in different languages.

I hope this gives you some ideas for how to fine-tune a transformer model and monetize it through a subscription-based web service! Let me know if you have any questions or need further assistance.
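For the curious, here is roughly what that fine-tuning step could look like with Hugging Face’s Transformers library. This is a minimal sketch under my own assumptions, not code Chat-GPT wrote: clickbait_titles.txt (one title per line) is a dataset you would have to collect yourself, and the hyperparameters are picked arbitrarily.

from transformers import (GPT2LMHeadModel, GPT2Tokenizer, TextDataset,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# TextDataset chunks the raw file into fixed-size blocks of tokens
train_data = TextDataset(tokenizer=tokenizer,
                         file_path="clickbait_titles.txt",  # hypothetical dataset
                         block_size=64)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(output_dir="titlegenie", num_train_epochs=3,
                         per_device_train_batch_size=8)
Trainer(model=model, args=args, data_collator=collator,
        train_dataset=train_data).train()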

Let’s write the MVP of a ClickBait generator

import openai
import pandas as pd

# Set up the OpenAI API client
openai.api_key = "YOUR_API_KEY"

# Load the PPC data into a Pandas DataFrame
data = pd.read_csv("ppc_data.csv")


def predict_performance(title):
    # Placeholder: the script as generated called an undefined model.predict()
    # here. A real version would train a predictor on the PPC data loaded
    # above; this stub returns dummy values so the script runs end to end.
    return {"conversion_rate": 0.9, "cost": 5}


# Define a function to generate titles
def generate_title(prompt):
    # Use the OpenAI API to generate a title
    response = openai.Completion.create(engine="text-davinci-002", prompt=prompt,
                                        temperature=0.5, max_tokens=50)

    # Extract the generated title from the response (the completions endpoint
    # returns only the continuation, not the prompt, so nothing to strip off)
    generated_title = response["choices"][0]["text"].strip()

    # Add a period if the title does not end with one
    if not generated_title.endswith("."):
        generated_title += "."

    # Use the (placeholder) model to predict the performance of the title
    prediction = predict_performance(generated_title)

    # If the predicted conversion rate is above a certain threshold and the
    # predicted cost is below a certain threshold, return the generated title
    if prediction["conversion_rate"] > 0.8 and prediction["cost"] < 10:
        return generated_title
    return None


# Test the function by generating a title for a sample article
prompt = "10 Reasons Why Dogs Are the Best Pets"
title = generate_title(prompt)
print(title)

Oh, and by the way, my original title for this post was: “How I started coding an AI startup in 5 minutes with Chat-GPT as my startup coach, and how it can become your everything coach!” I asked Chat-GPT to make it more “clickbaity”.

Below is the full conversation that led to this blog post’s title…

Conclusion: Chat-GPT as an everything coach

We live in a wonderful time. Sure, Chat-GPT is not perfect; a bunch of dot products over a gigantic tensor of 175B+ parameters is not the magic answer to everything. But we are getting there, one part of our brain at a time. This is happening now, and it will only get better over the next decade.

What I did here can be applied to anything, from learning a new skill to just getting inspired… our brains will do the rest (for now). Feel free to comment on what you are planning to do! And please share (for the few who will read this and find it useful!)

Like one of my favorite YouTube channels often says… Hold on to your papers! What a time to be alive!
