Seaman creator Yoot Saito on the fishy Dreamcast AI that was way ahead of its time

‘A concept that was universally strange for men and women.’

Sega’s Dreamcast was ultimately a failure, as Sony came to dominate the early-2000s market with the PlayStation 2. But Sega’s machine left behind a library of uniquely innovative and influential software. And perhaps no title was as memorable as Yutaka “Yoot” Saito’s iconoclastic Seaman, a virtual pet simulator that had you use a microphone to converse with a moody, sarcastic man-fish, with help from a narrator voiced by Leonard Nimoy.

Yoot Saito was previously best known as a Mac-focused developer who created The Tower, a tower management game that SimCity studio Maxis released worldwide as SimTower. (Its sequel, The Tower II, came to be known as Yoot Tower outside Japan.) After Seaman and its PS2 sequel, Saito designed Odama, a pinball-strategy hybrid for the GameCube where players used voice commands to direct soldiers on the table.

This summer marks 20 years since Seaman’s release, so I got in touch with Saito over email to discuss its development and legacy. His responses were so in-depth, with various passionate tangents into subjects like video game history and linguistics, that I decided to reproduce them here with minimal editing.

This interview has been lightly edited for clarity.

On how Seaman came to be after SimTower

“If you are just separating out the themes, one is a game about building skyscrapers and the other is about raising a human-faced fish, so they do look completely different. [But] if you trace my starting point in the game industry, it was because the masterpiece that is SimCity had a profound effect on me and led to me wanting to make games. So my core interest back then and even now is simulation games.

If I were to be even more specific, I would say I have a huge interest in communication and language. For me the most basic aspect of a computer game is building a new world as a game designer via the ‘language’ that is the game. In order to do that, you need to remove the unnecessary real-world pieces and break down what you choose to keep as key game features or elements that can be interacted with as items or via game commands. That’s the core backbone of simulation games.

So to that point, Seaman is a game where the challenge was to use the filter of language to bridge the gap between gamer and non-gamer. Trying to turn actual language into a language element / filter for games is incredibly challenging, but it’s actually like converting natural language expressions into playing cards and playing a card game with another person.

“‘Seaman’ is actually like playing poker.”

That means Seaman is actually like playing poker. The keywords are the ‘cards’ that are played back and forth between the end-user and the game character (Seaman). Most gamers wouldn’t realize it, but Seaman is a game with a lot of push-pull game elements in the same way as poker. That back-and-forth is a driving mechanic of the game. Just as in SimTower when you add a security room in the skyscraper, giving Seaman a different keyword leads to an entirely new conversational path, and perhaps that is the easiest way to convey this concept.

I conceived the idea of Seaman at lunch one day. I was talking with a developer working with me on a new Tower series game. We had multiple projects up and running, and one of the teams was working on a simulation game where you raised tropical fish in an aquarium. We were just joking around but I said, ‘If I was gonna make a game like this, I’d do something crazy!’ to which my co-worker responded, ‘Yeah, like what specifically?’ I felt like I was on the spot and had to say something quickly. 

‘I’d have the camera face the user and have the pet speak in human words. Something like, ‘Who was that woman you brought here yesterday?’ A pet like that would be great.’ A different co-worker said, ‘That would be really strange,’ to which I said ‘Yeah, strange and great!’ Of course we couldn’t call it Sea Monkey… we have to call it Seaman! And that went on to be the project codename and final name of the actual game.

Everyone laughed at the ridiculous nature of the idea, but that weekend I told my wife about it and she found it strange and great, too. So it was a concept that was universally strange for men and women. And then she said, ‘That’s a really strange and gross concept so you should definitely make it!’ And that’s when I realized even women could enjoy and appreciate a game that was strange and gross. Those words helped make my decision to seriously do the game.”

Yoot Saito.

On the influence of ‘90s virtual pets and Shigeru Miyamoto

“When you say virtual pets were popular in the ‘90s, I assume you mean Tamagotchi, but I wasn’t super interested in that trend. Because, as I mentioned before, I like complex simulation games. The more complex the theme, the more it is something no other creator had done, the more it interested me. So I didn’t want to follow someone else’s trend.

Most game creators find a winning recipe in a popular game and change a few elements or design pieces, story, etc., and then call that creating a game. But I don’t understand that mindset. Not focusing on sequels and wanting to make something really strange and new, or you could say, something that most probably won’t make money, is pretty rare. I have only met one other person who could understand my base feeling, and that is Nintendo’s Shigeru Miyamoto.

Miyamoto-san really likes strange and different concepts. So while Seaman was released on Dreamcast, behind the scenes we came very close to choosing to develop the game on the Nintendo 64DD. As a matter of fact, around the time Seaman was released on Nintendo’s rival hardware, the Dreamcast, Miyamoto-san was featured in a magazine wearing a Seaman shirt. Seaman was eventually released by Sega, but Miyamoto-san was an incredibly important person in the process of getting Seaman made, especially for someone like myself with very little console game experience.

Also, although Sony’s robot dog Aibo was released around the same time as Seaman, neither product was influenced by the other. Coincidentally enough, I did have conversations about a bipedal robot with multiple companies. Although it’s been 20 years since Seaman saw the light of day, I’m making an AI engine capable of speaking without needing preexisting scenarios or text strings. It’s been a long path to get here. Inventing this has taken a lot of time and used up a significant amount of different investments, but if you ask me it’s leagues more interesting than just developing a normal game and certainly more thrilling.”

How Seaman came to be on the Dreamcast

“Actually, a man named Irimajiri-san, who was the vice president at Sega at the time and later went on to become the president, was the one who led to Seaman being released on the Dreamcast. A different creator called Kenji Eno-san told me he had someone he really wanted me to meet and introduced me to Irimajiri-san.

Back then the Dreamcast was still being referred to by its codename Katana and was a new type of game console. Irimajiri-san was previously the president of Honda in the US and felt that Sega needed to focus on new types of gamers. It was the time when Sony was king, so I decided it would be interesting to take on the market leader from a lower position and joined Sega. Adapting a prototype built on Mac to new consumer hardware was quite challenging, but we were able to do it in about a year and a half. It’s thanks to all the hard work of the team back then that we were able to achieve this goal.”

The Dreamcast microphone.

On the Dreamcast hardware

“The Dreamcast microphone was developed by the Sega peripheral team. I remember they already had a prototype up and running. However, bringing a PC-developed game over to console presented quite a few problems. Games developed for PC were focused on using a mouse and keyboard. Basically you’d take your time playing most PC-focused games. You’d play a few games / turns / rounds on the keyboard and mouse and then switch over to doing work on those same input devices. They were games you could just spend some time on, but didn’t require your whole focus.

Conversely, console games required you to sit in front of the TV with your entire focus on the game. You grabbed the electricity-powered game controller and basically said ‘It’s game time!’ and work was not going to interrupt that. You rapidly pressed multiple buttons, pulled off different moves, and were focused on that experience. For the console industry, they wanted game experiences that had deep focus and interactivity. Since our game was developed on PC, our core focus and experience didn’t fit the target console platform. We needed to develop new ways of interacting with the game that required more actions (button pressing, etc.) from the end user and things to make them focus constantly on the game while playing it, which led to some major design changes.”

On localizing Seaman into English

“I asked Sega of America (SOA) to handle almost all of the localization for the English version of Seaman. The reason for this is that there is a stark cultural difference between different countries and when someone in Japan asks you what your blood type is, it’s the equivalent of asking someone what their zodiac sign is. So translation alone wouldn’t have been enough to develop an English version of Seaman. You needed to pretty much rewrite the script for it to work. SOA knew a third party that was very skilled at rewriting scripts, so they focused on that, which allowed our team to focus on building out the core system and game without limitations. The one proposal I had was to use Leonard Nimoy as the user guidance voice.

Japanese is a very nuanced, complicated, feeling-based language. It’s a language great for emphatically sharing subjective feelings, so Japanese really connects to deep parts of a speaker’s emotions. Since I’m not a native speaker of English I cannot comment on the deep connections one may feel when speaking English, but I can say when I read the game comparison between Ecco the Dolphin and Seaman in Rolling Stone magazine, it was clear to me the writers did a great job understanding the core concept of the game.”

On the state of voice recognition technology at the time

“The speech recognition engine we used with Sega in Seaman was licensed from a Belgian company. We used their speech recognition engines as users, not creators. As memory size was limited at that time, the performance of speech recognition was completely different than it is now. Specifically, we used a function called navigation. Currently, the common function is called dictation. When you create a navigation type of conversation, it needs to predict in advance what the user is likely to say and have that as a choice. If the user utters a word that is not an option, it will get an error. Therefore, the writer needs to put in advance what kind of words the user will say and make a conversation that leads to one of them. Because of this you had a structure in which conversations are done while exploring intentions. It was an interesting experience that hadn’t been done in a game before.”
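The "navigation" approach Saito describes, where the writer lists in advance every word the user might say and anything else counts as an error, can be illustrated with a small sketch. This is not Seaman's actual code or the licensed engine's API: the `Turn` class, the `SCRIPT` table, the `respond` function, and every line of dialogue are hypothetical, and the recognizer is assumed to have already turned speech into a single word.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """One scripted line plus the keywords the writer expects in reply."""
    prompt: str
    # Maps an expected keyword to the id of the next turn (hypothetical layout).
    branches: dict[str, str] = field(default_factory=dict)


# Invented example script; none of this dialogue comes from the game.
SCRIPT: dict[str, Turn] = {
    "ask": Turn("Do you like fish?", {"yes": "glad", "no": "hurt"}),
    "glad": Turn("Good. We will get along."),
    "hurt": Turn("Then why are you keeping one?"),
}


def respond(turn_id: str, heard: str) -> tuple[str, str]:
    """Return (next_turn_id, reply) for a word the recognizer reported.

    A word outside the current turn's keyword list is treated as a
    recognition error: the conversation stays put and the prompt repeats.
    """
    turn = SCRIPT[turn_id]
    word = heard.strip().lower()
    if word not in turn.branches:
        return turn_id, "I didn't catch that. " + turn.prompt
    next_id = turn.branches[word]
    return next_id, SCRIPT[next_id].prompt
```

In this sketch, each prompt has to steer the user toward one of the listed keywords, which is the writing constraint described above. A dictation-style system would instead accept free-form text and would not need the per-turn keyword table.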

On designing the game around technical limitations

“At that time there weren’t other games besides Seaman that had voice conversations, and not so many even now, so it’s hard to compare. But voice recognition technology itself is evolving rapidly right now — if a clever game designer worked on something I think they could make a really interesting game. However, when speaking of voice recognition as a technology, when you try to cross a certain threshold (like human), intelligence is required, not hearing ability. In other words, what’s indispensable is not voice recognition technology, but rather understanding of artificial intelligence. The speed of advancement of AI is really rapid right now, so I think it’s a fun time.

Getting back to the subject, I was able to finish the development of Seaman specifically because of the limit in technology. Had there been none, I would have never finished it. As I was saying earlier, when bringing a game conceived on a PC to the consumer, there is a big environmental difference called time density. How to go about overcoming that tested my wisdom, one could say.”

On Seaman’s personality

“The reason Seaman’s personality and background are so distinctive is because it contains my philosophical idea, not a specific message. At the beginning of the Seaman project, I firmly decided it would be the opposite of conventional games in three respects:

  • It wouldn’t be cute
  • It would look into the gamer’s world from inside the TV
  • The theme of conversation would be everyday stuff, not a fantasy world

I was convinced that if I stayed true to these three aspects it would turn out completely different from any other game — something different, not often seen. This was my strategy and what I still strive to do, even now.”

On the reaction to Seaman

“The response to Seaman’s release in Japan was much bigger than in the United States. It came to be a social phenomenon, with comedians appearing on Japanese TV shows imitating Seaman and wearing Seaman costumes, or the news saying, ‘Games have evolved so much!’

Touching back upon my personal philosophy, when a Japanese creator creates a world in Japanese for Japanese users that has a large impact, the social phenomenon is that it attracts various media and society. In the United States, the phenomenon was like, ‘A relatively rare game has come out. It’s for Sega’s new hardware they’ve been pushing in the console wars.’ So I think the recognition is quite different.

After 20 years, I see current performers on TV in Japan still talking about Seaman. They said it was because it had a big impact on them when they were children. They joked and said, ‘Seaman traumatized me in my early childhood.’ As a creator, this is a great honor.”

On SeaMail, the Windows desktop spinoff

“SeaMail is a kind of prototype on a more versatile PC platform that exchanges slightly more complicated words. Due to budget constraints, it didn’t get very far in completion but was a very ambitious attempt. So Seaman, unlike the user, lives in the desktop world of a multipurpose PC and looks at the content of the user’s emails, frequency of interaction with specific people, and comments about the user’s relationship with others. It was a lot like the current Siri. This knowledge exists in the artificial conversation engine I’m working on now.”


On using the microphone in Odama

“The purpose of the microphone in Odama is completely different from Seaman. Odama is a game that simulates a fray of soldiers. I was interested in this so-called ‘samurai crowd.’ The reason for the microphone was because both of the user’s hands are occupied by the pinball. Voice was used for nothing more than giving commands. I was also motivated to use it because I wanted to recreate the feeling of a general giving orders to a fray of soldiers in feudal battles. I wanted it to feel like a football coach giving orders like ‘push’ and ‘to the right.’ That’s it.

For me, Odama was the greatest masterpiece in my life, but the launch was delayed over and over. I think I really annoyed Nintendo. The programmers at that time had no idea about controlling crowds of people with autonomous robots or artificial intelligence so development was set back quite a bit.”

On current AI and voice assistant tech

“Artificial intelligence is at a very advanced stage right now, but the method of conversation within Japanese has not changed in 20 years. Seaman was a game where the planner wrote preexisting scenarios and people enjoyed how they played out.

Infinite conversation was impossible. So, I’m no longer interested in that project. The scenarios can only be increased by two or four times compared to the previous game. That’s why I started the Artificial Intelligence Laboratory to create an engine that keeps talking even without scenarios. If we can complete this engine, we’ll start developing a ‘Seaman with no end in scenario.’ The new thing will be the voice of the Seaman generated and spoken on the spot in real time, not a prerecorded voice. This requires a lot of inventiveness, but right now, we’re just starting to see the light at the end of the tunnel.

Yoot Saito.

Japanese is a very peculiar language in that it seems to have solid grammar, but actually it doesn’t. Well, actually it’s not right to say that it doesn’t. The Japanese grammar we learned in school textbooks is very much alienated from the Japanese we use in conversation, which is not a real grammar. This divergence and contradiction has been ignored for decades. There was no need to do something so costly such as affixing new grammar. In Japanese, there are two or three subjects. However, the SVOC style proposed by the United States after the war has come to be used. Linguists have long argued for this, but none of them, who stick to writing dissertations, have made an AI engine and attempted to prove their hypothesis is correct.

And now is the era of conversation engines. It may not have been necessary to redefine the grammar for schools, but it is necessary to logically redefine Japanese in order for a robot to speak. So that’s what we’ve been working on.

It’s been difficult. Trial and error, on and off for about 10 years. When I changed the grammar to two or three subjects through digitization and symbols, I started to succeed. My experience in the development of Seaman and with nuance in doing voice acting helped make this possible — particularly the fact that the meaning of a word changes depending on how ‘melody’ is applied. We noticed that Japanese grammar resides in melodies, not letters. This wasn’t apparent during the age of paper and pen. I had overlooked this as nuance, when in fact, it has been a pillar of Japanese grammar, and as the voice actor of the Seaman series, I became aware of this notion and decided to develop a melody recognition engine.”

On current projects

“Right now I’m creating a conversation engine that keeps talking even if there is no scenario. To that end, we have established an organization called the ‘Seaman Artificial Intelligence Laboratory.’ When this engine is complete, I think voice recognition in conversation will change greatly.

I expect a talking robot product to be released in Japan in about half a year. Once the ball gets rolling, I would like to license this engine to many machines and create an era in which Japanese-made microwave ovens, cars, cameras, and robots speak fluently. We are looking forward to seeing it happen and are looking for people and investors who will help us with English. An English site is still being prepared, but you can contact the website:”