Ep8: Inside VideoGen - The AI Startup Simplifying Video Creation
At VideoGen, we try to make it fast and easy for anyone to create videos.
It's a very simple problem.
Video creation is hard,
and we try to make it easy.
Our most popular workflow is script to video, where you start with a script or you just
start with a prompt.
And then we'll automate every single step of that process to take you from script to video
in typically under a minute.
So this was very surprising.
We have some users that are visually impaired, and they want to create videos to
produce content on social media, but they can't use a traditional video editor.
Our goal is always creating the best experience for our customers.
A little bit of a cliche answer, but our customers want better videos, faster videos, more
videos.
Today's guest is Anton Koenig, CEO and co-founder of VideoGen, a YC-backed startup that is
building an AI-powered platform for video creation.
Let's dive in.
Anton, can you tell me what you're building and what is that big problem that you're
trying to solve?
Yeah, so at VideoGen we try to make it fast and easy for anyone to create videos.
It's a very simple problem.
Video creation is hard,
and we try to make it easy.
So we use all the latest technology: AI, our application runs in the browser, and it's
collaborative.
So we make it easy for individuals as well as large teams to create videos across
marketing, training, education, and a long list of other use cases.
And we've been doing that across
small creators, small and medium-sized businesses, and even large enterprises who
are creating at scale.
Yeah, there are several video generation tools in the market today.
What makes VideoGen unique and different from the competition?
Yeah, so there's a lot that goes into every single platform.
Video is a very complex medium.
So you're dealing with four dimensions, time, space.
And basically, we automate the entire workflow end to end.
And we have multiple workflows that cater to very specific use cases.
So our most popular workflow is script to video, where you start with a script or you just
start with a prompt.
And then we'll automate every single step of that process to take you from script to video
in typically under a minute.
A lot of other AI video editing platforms will have a lot of features, but they won't
offer those end to end workflows like we do.
So basically you explain the video in natural language, you have the script, and then
VideoGen generates that video end to end, and I'm assuming you would need to do
minimal reruns.
Is that right?
So this depends on the use case.
Typically, if a user is doing something like creating videos from blog posts or from
existing content, they won't need to do that many reruns because the script is very
refined and they're typically just trying to repurpose existing text content.
So those types of users tend to create in bulk.
They might be creating dozens of videos a week.
And they will take the output, maybe add a few final touches to it, like their logo or
slightly changing one asset here or there.
But then we do have some users that
basically use that video as a first draft, and they might spend two hours
afterwards actually fully editing it into a client-ready promotional video, for
example.
So it really varies.
And I think both use cases are valid.
It just depends on the budget of the project, how many videos you need to create,
and how much time you have on your hands.
Yeah, I hear you.
Interesting.
So what are some of the most common or, let's put it this way, most interesting use cases
that you've seen for VideoGen?
So definitely marketing tends to be the most common. A lot of marketers, B2B and
consumer marketers, are realizing that video is really important in today's day and age.
People aren't reading text anymore.
They're not even reading slide decks or blog posts. So companies are starting to post on
YouTube or they're starting to post on short-form platforms as well, and they're typically
following a strategy similar to what you might have done
on blogs previously.
So you're writing relevant content for your customers.
You're talking about maybe recent news stories, or you're giving advice or sharing
knowledge.
But now instead of doing that as a blog, marketers are doing that as a video and then
they're posting it weekly or sometimes even daily to YouTube or TikTok or another
platform.
Who surprised you the most as maybe one of the early power users?
I know you mentioned marketers, but maybe there was a surprising group of users or
roles, whether it's educators or maybe enterprise customers.
Who was it?
So this was very surprising.
We have some users that are visually impaired, and they want to create
videos to produce content on social media, but they can't use a traditional video
editor
because their screen reading software is unable to interact with these complex timelines.
So with VideoGen, they can actually create videos with limited vision or sometimes
while being fully blind.
So that's something that we did not expect going into creating VideoGen, but it's
definitely lowering the barrier to entry for creation.
And there are people that traditionally would not have been able to create videos that are
now creating a lot of videos.
That's very good to hear.
Now, if we talk about the technology behind the product, can you describe the technology?
So the technology, again, is going to depend on the specific workflow that you're using,
which is one of the reasons why VideoGen is kind of unique to every user, because most
users will find a workflow that they really like and then they'll just keep using that one
and might not interact with the others.
But script to video is basically managing the entire video editing process behind the
scenes.
So it's going to go out and source stock footage.
It can generate AI footage where it needs to.
It will add captions and then animate those captions.
It will generate text to speech, like realistic-sounding voiceovers.
It can generate avatars.
So we're taking all of these media production models, like avatar models and
text-to-speech models, and we're combining them into a system that can orchestrate
everything and essentially direct
your video end to end.
What we saw users doing before, users that were previously trying to
create content in bulk but hadn't used VideoGen,
was basically doing that themselves.
They're going on 20 different platforms generating individual pieces of content, like a
text-to-speech voiceover or a video from a video model.
And then they take all of that and they have to edit it together manually in an editor.
So we're taking all of those different platforms and collapsing them into one interface
where
we're going to make sure that you have the highest-quality and the fastest model for every
single use case.
We're going to handle all of those retries, where if the quality isn't good, you can
easily just regenerate it.
So it's hard to point to one very specific technology.
It's really the combination of everything that makes it powerful for our users.
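The retry handling described in this answer can be illustrated with a minimal Python sketch. This is not VideoGen's implementation: the function generate_with_retries, the stand-in generator fake_voiceover, and the integer quality score are all hypothetical, chosen only to show the idea of regenerating an asset until it passes a quality check.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def generate_with_retries(
    generate: Callable[[int], T],
    is_good_enough: Callable[[T], bool],
    max_attempts: int = 3,
) -> T:
    """Call generate(attempt) until the result passes the quality check.

    If no attempt passes, the last result is returned as a fallback.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result = generate(0)
    attempt = 1
    while not is_good_enough(result) and attempt < max_attempts:
        result = generate(attempt)  # regenerate the asset
        attempt += 1
    return result


# Hypothetical stand-in for a text-to-speech call: quality improves per attempt.
def fake_voiceover(attempt: int) -> dict:
    return {"attempt": attempt, "score": 40 + 30 * attempt}


def passes(asset: dict) -> bool:
    # Hypothetical quality threshold on an integer score.
    return asset["score"] >= 90


voiceover = generate_with_retries(fake_voiceover, passes)
```

In a real pipeline each media step (stock footage, captions, voiceover, avatar) would presumably get its own generator and quality check; the sketch only covers a single step.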
In terms of some of the key features, any specific ones you would like to highlight?
Yeah, one of my favorite features is auto replace.
It's a very simple feature.
If you have a stock asset that the AI has picked for you, so a stock video or stock
image, you just click auto replace and then it goes and quickly finds another one for you
that's relevant.
So this is one of our first features.
When we started, this was even before AI video generation, before the image models or
video models were actually good.
So the tedious, time-consuming process
that we were focused on was just having to switch between going on your video editor and
then going to a stock media site and searching and then going back and forth and back and
forth.
And it's one of those features that's a time saver.
It's a small time saver.
It's a small little convenience, but by having that button right there, you can just click
it and then you can move on to the next editing task.
So it speeds up your workflow.
Similar to this auto replace feature, we have that for generative AI media.
So if I have an image, I can essentially prompt to edit that image.
So this technology is super new, but it's also very, very powerful.
For example, if you have an image that contains some text and you want to translate that
text to Spanish, say, I can just say, "Translate the text to Spanish."
The AI is actually going to edit the pixels of the image and turn it into Spanish
with the same font and everything, and it's going to make sure that the layout is
correct.
So this is one of those features where traditionally you have to string together a ton of
models and you need to try it a ton of times.
But in VideoGen, you literally just click once, type, "Okay, make it Spanish," and then
everything's done for you.
So translation is something that, for enterprises or media companies that use VideoGen,
is one of the most time-consuming parts of video production, because if they have a
global audience, every single video might have to be in 10 or even 20 languages.
So that's one of the biggest unlocks for some of our larger customers.
Now, switching gears into market traction, what are some of the key metrics that you pay
close attention to and how have those metrics evolved so far?
Yeah, so I can't get too specific with our metrics, but we generally look at revenue,
monthly recurring revenue, like a typical SaaS company.
And then we also look at retention, and we break down retention by certain features.
So we try to see activation and repeated usage on workflows or features that are unique to
VideoGen.
For example, that auto replace feature is one feature where we've seen users that
use it once continue to use it.
And typically with any feature,
the first time someone uses it, maybe there's a 50% chance they use it again, but
then if they use it twice, there's a higher chance that they'll keep using it.
So for every primary feature, we have a curve of how retentive it is,
how useful it is to our customers.
And we try to do that across all the key features.
The goal is just to move that curve up.
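One way to read the per-feature curve described here is as a conditional repeat-usage rate: of the users who used a feature at least k times, what fraction went on to use it at least k + 1 times. The Python sketch below computes that under this assumption. The function name repeat_usage_curve and the sample counts are made up for illustration and are not VideoGen data or VideoGen's actual metric definition.

```python
def repeat_usage_curve(usage_counts, max_k=5):
    """Map k -> share of users with at least k uses who also reached k + 1 uses."""
    curve = {}
    for k in range(1, max_k + 1):
        reached_k = sum(1 for count in usage_counts if count >= k)
        if reached_k == 0:
            break  # nobody used the feature this many times
        reached_next = sum(1 for count in usage_counts if count >= k + 1)
        curve[k] = reached_next / reached_k
    return curve


# Hypothetical data: how many times each of ten users clicked a feature.
counts = [1, 1, 2, 3, 3, 5, 1, 2, 4, 1]
curve = repeat_usage_curve(counts, max_k=3)
```

With this sample, 6 of the 10 users who tried the feature once used it again, so curve[1] is 0.6. "Moving the curve up" would mean raising these fractions at each k.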
Does that metric also tell you what future potential features you guys would need to build
for customers?
Definitely.
Most of our features are sort of like an appetizer to
a whole suite of features.
So, for example, going back to that auto replace feature, that has now
expanded to include an easy way for you to edit the image in place or
automatically regenerate a video.
We find that our users tend to like those features where you can just click one button and
then let the AI do some work for you while you work on something else and then go back.
And just by seeing those metrics, that's informed a lot of our products.
I always love this question and love to hear the answer.
How did you guys land your first customer?
What detailed steps did you take to land that customer?
So our first customer... surprisingly, we got a few customers in our first week of launch.
We launched almost three years ago now.
So this was, I think, about six months after ChatGPT. AI was still a bit of a new thing in
most people's minds, and AI video was definitely very new.
We spent six to eight months actually working on the first version.
I guess, contrary to some other startups, just getting the video platform working
and getting video working on the web is a very hard technical
challenge.
So we spent a lot of time getting it working reliably and making sure that it produces
actually good results regularly.
Mm-hmm.
And then for launch, we basically went on a bunch of listing sites, these startup
launch sites where you just say, "Hey, this is my startup.
I would love for you to check it out."
We didn't do Product Hunt, but we did some other launch sites similar to
Product Hunt.
I posted on Reddit.
I responded to some people on Quora who were asking about video platforms.
So basically, in one day, my co-founder and I just sat down and grinded out a ton
of different listings.
And then pretty soon we got a few thousand users visiting the site, and a few
customers decided to convert and pay.
Now, talking about the customers and expansion, how do you guys plan to put the product
into the hands of millions of users going forward?
What channels are you guys focused on the most?
Yeah, one of our new up-and-coming channels is definitely answer engine
optimization.
So you're trying to make pages that AI chatbots are going to cite.
That's a new channel, and it's very similar to SEO.
But generally, we try to target people when they're very high intent, when they're clearly
interested in video creation.
So they're either searching for it on Google or they're asking a chatbot about it.
And we just try to provide helpful content.
So whether that's a video or a blog post,
we're producing content that will hopefully get cited by these different platforms.
Makes sense.
Yeah.
Now, in terms of challenges, what has been the biggest challenge so far for the company?
The biggest challenge, I think, is definitely recruiting. It's just a very hard challenge.
That's one of those things that doesn't get easier with technology.
Luckily, a lot of the challenges that we used to have, like trying to
produce high-quality videos,
have gotten easier as the AI models have gotten better.
So we're able to create smarter agents and we're able to generate higher-quality media.
But hiring people is always a challenge, especially as a seed-stage company.
We don't necessarily have the resources to get the top people right away.
So it takes a lot of convincing.
Yeah.
Now on that topic, what are some of the key characteristics that you look for in
candidates?
I wish I had a simple answer for that.
I think we're still figuring that out.
I definitely think curiosity is important.
Honesty is very important.
People that are not just honest to other people, but honest to themselves, that can
admit when they're wrong and are willing to try new ideas.
Autonomy is definitely important.
Given we're such a small team, we're only six people, someone that we bring on has to be
able to deal with ambiguous scope.
Those are the characteristics we look for.
And sometimes you can tell that based on
previous work experience. If they've worked at a small company before and
liked that, then that's usually a good sign for us.
Yeah, that makes sense.
Now, let me take you back.
Can you talk about your background as well as your co-founder's background, education,
prior roles, et cetera, and how did you guys come together to build VideoGen?
Sure.
My co-founder and I have actually been friends since middle school.
So we were doubles partners together.
We grew up on the East Coast, in the Boston area.
And at the time, we were kind of the only kids interested in web development and, I
think, just the internet and YouTube and multimedia creation in general.
He was building web apps just for fun, launching a bunch of projects.
And I was doing freelance web design and freelance video editing.
So we did a few projects together, nothing like a startup, but we just launched some
things for fun.
And we knew that we wanted to work on something more serious when we got older.
I guess, fast forward to 2022.
We were both studying computer science.
He was at Brown, I was at UMass Amherst.
And we both did the big tech internships.
We got to see engineering at scale and what that looks like, and got to work with
some very talented engineers.
I think through that we realized that
we wanted to create our own thing, that we didn't want to work for these massive
companies.
As cool of an experience as it is, you can feel like a cog in the machine sometimes.
So we set out to do our own thing.
He had worked at some video editing companies before, and I had been doing
freelance video editing as a kid.
So we knew a lot about video, and then it was actually right after the launch of
ChatGPT.
We were working on something completely unrelated to VideoGen, which we never
launched, but we were using Bart, which was Google's LLM before
ChatGPT, which was not super good.
We saw some potential, but we didn't really know what it
was useful for.
And then ChatGPT launched, or GPT-3, and we realized that we could use it to automate a
lot of tedious aspects.
So the first one for us was sourcing stock footage: searching for stock footage for
you, finding the best one, and then clipping it in place.
And that was the original technical insight.
I mean, we tried so many ideas before VideoGen,
and we were trying to find something that we were knowledgeable on,
something that we felt was valuable, and then also we wanted to do something that was
fun and interesting.
Once we realized, okay, there's an opportunity to do something cool
with video, we really stuck with it, and we spent about six months working on that first
version.
Now, in terms of your undergrad, are you guys both technical, and did you guys go to
technical schools?
Yeah, so I was studying computer science.
My co-founder was studying computer science and mathematics.
Yeah, we've both been coding for a long time.
I think I've always been more of a front-end developer.
So I think I approach it from a designer's perspective.
And he very much has systems thinking, and he's very talented.
And yeah, I think we have a good dynamic.
Now, I know you guys went through YC, and in that regard, two questions, I guess.
One, what would be your advice for other founders to get into YC?
I know it's very competitive.
And then what was your biggest takeaway from that experience?
Yeah, it's hard to give broad advice for getting in because I know everyone's
experience is pretty different.
Also, my impression before was quite different from after going in,
just given the diversity of different types of people, different types of
startups.
It's very case by case, right?
Some people are
PhDs who discovered something in research and now they want to spin it out into a company.
Other people, like ourselves, are
new grads, or myself, I'm a college dropout now.
And we're just hacking things together and we have an idea that we're passionate
about.
And then there are others who are these industry professionals.
They have a dozen years of experience under their belt.
They have a team that they know they want to hire already and they just need some funding.
So if you're in my shoes, I think the best thing to do is probably to try to
get some revenue as soon as possible. Building something that is not
just cool, but something that you can prove people are willing to pay for, I think is
very important.
For us, that was definitely the thing.
We spent almost a year building the company before YC.
So we had a lot of traction, we had a lot of revenue.
And yeah, I think that comes from solving a problem that is really dire.
So in our case, creating videos is super tedious.
It takes hours, sometimes days, but it's also a super valuable task because for marketers,
sometimes video is what moves the needle for getting more leads, and for
training and education, video is the end output that you need to create.
A lot of schools moved online with COVID, and after COVID, there are a lot
of remnants of these online curriculums.
So yeah, they have to create content across 20 different languages.
It's a whole process.
So yeah, I think identifying a problem that is real and that people are willing
to pay for, I think that's the most important.
Nice, makes sense.
Then, in terms of the biggest takeaway from YC, what would that be for you?
Yeah, sorry, biggest takeaway.
I mean, I think the biggest takeaway is that there's just a lot of
different things out there.
It's also like coming out of college, right?
You just realize, wow, the world is much bigger than college.
I think seeing how
different people can be so talented at very different things
is a big takeaway.
Seeing how every field just has so much depth to it.
And a true expert in their field is like...
there's just a whole world in every single little niche.
And there are so many problems that can be solved if you're building startups or building
technology.
So I think it's exciting.
Yeah, hopefully also inspiring to people that want to create a startup.
Now, if I look into the future next three to five years, what is the vision for VideoGen?
What is that North Star for the company?
Yeah, I think our goal is always creating the best experience for our customers.
A little bit of a cliche answer, but our customers want better videos, faster videos,
more videos.
So we're going to continue delivering that.
We're going to deliver that more for our existing customers.
And then also we're going to try to expand to new customers by creating more
industry-specific workflows.
So that's always going to be our North Star. I guess broader than VideoGen
itself,
in the next five years, we want to expand past just video.
A lot of the technology that we're developing is for video, which is actually one of the
hardest mediums to produce because you're dealing with this timeline and it's very heavy.
The files are very large.
You have to serve it over the web.
So all of the challenges that we're solving technically and all the user experience
challenges translate to other forms like slideshow creation, even website creation,
or just graphics creation.
So we do want to build a true creative platform where you can not just create videos,
but you can create any type of medium and essentially turn your ideas into
consumable content that is shareable to every single person.
If I have an idea, if I want to express something, I should just be able
to give it to an AI, and the AI is going to package that for every single type of person in
every single language, in every single content format.
So there's a lot of exciting stuff.
It's just a matter of time, because the AI needs to become smarter and
smarter for us to be able to build these technologies.
Yeah, that's a pretty exciting mission.
That being said, I know we went through the product, the company, the background, and
future vision. As a final thought,
is there anything else you'd like to share with the audience?
Yeah, I think the two things would be, one, that we're hiring.
We are hiring designers, marketers, engineers.
So the roles will probably still be open by the time this podcast launches.
So check out our job board at videogen.io.
And if you're a builder, we have an API.
So a few customers are already building on top of it, building very
vertical-specific video applications.
We have this platform, but it's almost impossible to serve every single type of video and
every single customer because there are so many different needs out there.
So some of our customers with the API are building for a hyper-specific use case, and
they're solving a problem that's more valuable than what we can solve.
Well, Anton, thank you so much for coming to the show.
I hope we will meet sometime soon again and you will have even bigger success stories to
share.
