When we left off in Part II, I had just concluded that, to make a truly seamless interface, I would have to design and build my own hardware. It was the only way to get an audio interface free of deal-breakers like short battery life, blocked ears, non-configurability, and an uncomfortable fit.
What did I know about hardware design before I got started?
Having tested a variety of headphones, headsets, and earbuds already, I knew that bone conduction tech was the best bet for getting an all-day experience. The other route was using directional speakers, but I wouldn’t have the option of tuning out of my surroundings by plugging my ears. This was essential because blocking outside noise improves concentration.
In the spirit of J. C. R. Licklider, my main premise was to augment humans in the most natural and efficient way. A way that helps us achieve our goals and be more conscious. An audio interface, as we established in Part II, was the best possible augmentation.
Designing from first principles
We’ll talk about hardware in a lot more detail in a separate blog post. For now, here’s an overview of my personal experience as a complete beginner.
I reasoned that when building new hardware, I should start from first principles. How do you set up communication between a human and a computer? The fundamentals for a seamless interface are input, output, and ergonomics.
The goal is to maximise the bandwidth of information exchange and make sure this “relationship” benefits us. On one hand there is a computer, on the other hand there is us. What are the options?
Input: The base layer is the brain, but seamless BCI interfaces are not yet possible (as we discussed in Part II). I moved one layer up to our five senses: Sight, Hearing, Smell, Taste, Touch.
For Sight, AR glasses are too far off to be usable.
Smell and Taste are…well, impractical. They’re rather low bandwidth for information, and the technology to support them will not fit in a wearable (as of now).
Touch is interesting. Subtle nudges or taps can indicate a notification or a change of state. The information is mainly binary unless you are using Morse code, and even then the bandwidth for any complex information is low.
Hearing: spoken word is high in terms of bandwidth, and a natural way to receive information.
Output: Well, one can imagine countless ways that we express ourselves and relay information. Speech, facial expressions, posture, gesturing, manipulating our environment…or lifting your eyebrows. Remember the weightlifting Snapchat filter game?
In terms of human-computer interface, I focused on hand movements and speech. We learn to type fast, but an interface made for rapid typing is not wearable, and it is not always seamlessly accessible. Some companies have tried creating wearable keyboards that look like gloves, but people don’t use them, and they’re still too large.
Input: Any change in current can be considered an input for computers.
Output: Computers combined with specific hardware can generate sound waves or light pixels, move physical things, print on paper…the only limit is our imagination (and technological advancements).
I filtered through the options and looked at possible output for wearables. Speakers, or bone conduction transducers, to transmit sound. Smart glasses to transmit light via pixels, or LED to indicate state. A touch pad on a headset for haptic taps.
There has to be a way for the person to talk to the computer (output), and a way for the computer to talk to the person (input).
There were too many all-nighters and brainstorm sessions spent imagining the best interface. Crazy and impossible ideas making the best use of today’s tech. It was all worthwhile, but out of the scope of this blog.
Tl;dr—I had to make the interface disappear.
Output: Speech (captured by microphones): Microphones can pick up vibrations from the air, or from the body (bone conduction microphones). For the latter, the most common placements are on the throat or in the ear canal. I tested throat microphones and the quality was awful. They’re bulky and uncomfortable. You can watch this video if you want more info.
In-ear microphones have been used in products such as Bragi Dash to improve sound quality.
Hand movements (captured by touch surfaces): Buttons provide physical certainty that the action was performed. However, the pressure on one’s head is uncomfortable and the button mechanisms take up a lot of space on the wearable.
Touch surfaces felt more natural for interaction. The haptic response has to be built in to provide feedback for the user that the gesture was successful.
Input: I really started from square one for input. How do we hear? Through our ears. What parts of the ear are involved for air conducted versus bone conducted sound? The sound travels through bone, so where on the skull should the transducer go? Forehead? Jaw? Near the ear?
I ordered various types of bone conduction transducers, got in touch with both high-end and low-end transducer manufacturers, and requested samples. Then I tested every possible location on my skull to see which position gave me the best sound quality and clarity.
While figuring this out, I continued wearing various types of bone conduction headsets every day to help me understand what’s essential, what’s wrong with their design decisions, and how my day changed with this new technology.
I was able to stay connected throughout the day to both my surroundings and my phone. Because of bone conduction, I always had a choice between the two. I could easily block out a noisy environment in order to focus, just by putting in ear plugs.
Podcasts and music worked well. Calls and AI assistants were not usable due to poor microphone quality.
This was happening in wintertime, from 2016 to 2017, when I was finishing my degree and working in the design/art studio. I discovered that the back loop of the headset was not compatible with winter clothes.
It caught on jacket and coat hoods, scarves, and hats. The transducers would be pulled out of place, interrupting the audio.
I was also experimenting with various 3D-printed frames for the new design. The goals were to put the transducers in the right place, apply the right pressure, and keep the microphones close to the source of sound.
Going back to first principles again, I was designing the headset from scratch. I experimented with a single bone conduction earpiece by taking the electronics out of one headset and putting them into a 3D-printed earpiece frame.
It was large and uncomfortable, but the point was to test my hypotheses as fast as possible and worry about miniaturizing the electronics later. Miniaturization was what would eventually make the device more comfortable.
My ears looked like red turnips after a couple hours of testing, and I went through dozens of iterations.
The experiments made it clear: there needed to be two transducers, on both sides of the head. And a simple ear hook didn’t maintain enough pressure on the transducer to conduct sound clearly. The headset would have to have a back loop.
The headset had to be comfortable anywhere, anytime. The only given was the place for transducers. The discomfort was due to the headset getting in the way when I moved my head.
So I 3D-scanned my head in all extreme positions (e.g., looking up and right) and determined the place on the head where the length between the two transducers changes the least across all of the extreme positions.
(And then later on I managed to get data on about 3,000 3D-scanned human heads from an anthropometry project, and determined the exact difference in lengths among the population. More on that in another blog post.)
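To make the selection step concrete, here is a minimal Python sketch of the idea: for each candidate path between the two transducers, measure its length in every head pose and keep the candidate whose length varies the least. The data layout, the function names, the candidate names, and all of the numbers are made up for illustration; they are not taken from the actual scans or tooling.

```python
import math


def path_length(points):
    """Length of a polyline given as a list of (x, y, z) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def most_stable_path(scans):
    """Pick the candidate path whose length changes least across poses.

    `scans` maps a candidate name to {pose name: polyline}. Returns
    (best_name, deltas), where deltas[name] is the longest minus the
    shortest length measured for that candidate.
    """
    deltas = {}
    for name, poses in scans.items():
        lengths = [path_length(polyline) for polyline in poses.values()]
        deltas[name] = max(lengths) - min(lengths)
    best = min(deltas, key=deltas.get)
    return best, deltas


# Hypothetical measurements in millimetres, two candidates and two poses.
example_scans = {
    "low_neck": {
        "neutral": [(0, 0, 0), (60, 0, 0), (120, 0, 0)],
        "up_right": [(0, 0, 0), (60, 25, 0), (120, 0, 0)],
    },
    "back_of_skull": {
        "neutral": [(0, 0, 0), (60, 10, 0), (120, 0, 0)],
        "up_right": [(0, 0, 0), (60, 12, 0), (120, 0, 0)],
    },
}

best, deltas = most_stable_path(example_scans)
print(best, deltas)
```

A real scan would give many more points per path and many more poses, but the comparison stays the same: the smaller the spread in length, the less the back loop has to stretch or shift when the head moves.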
I found out that reprogramming the main Bluetooth system on a chip (SoC) of an existing headset is a challenge. I was still learning electronics and programming, so it would take too long. However, I wanted to test the extra functionality to see whether it even made sense to try.
The plan was to attach extra hardware to a bone conduction headset I already had. The hardware would make it possible for the headset to communicate with an app. That way I could control the headset and really explore the use cases.
I bought an Arduino kit and put together a prototype with a Bluetooth Low Energy (BLE) board in connection with a capacitive touch sensor board. After I added copper tape to an existing bone conduction headset, it worked.
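For readers curious what the touch logic in such a prototype might look like, here is a rough sketch in Python (the real prototype ran on Arduino hardware, so this only illustrates the logic). It assumes the sensor delivers a stream of raw readings; the threshold values, the sample counts, and the function name are all hypothetical.

```python
def detect_touches(samples, press_above=60, release_below=40, long_press_samples=5):
    """Turn raw sensor readings into 'tap' and 'long_press' events.

    Two thresholds (hysteresis) keep a reading that hovers near the
    boundary from producing a burst of false touches. A touch that is
    still held when the samples run out produces no event.
    """
    events = []
    touching = False
    held = 0  # number of samples the current touch has lasted
    for value in samples:
        if not touching and value > press_above:
            touching = True
            held = 1
        elif touching and value < release_below:
            events.append("long_press" if held >= long_press_samples else "tap")
            touching = False
        elif touching:
            held += 1
    return events


# Made-up readings: one short touch, then one long touch with a brief dip.
readings = [10, 12, 70, 75, 30, 11, 80, 82, 81, 79, 55, 78, 20]
events = detect_touches(readings)
print(events)
```

Each event could then be sent over BLE to the phone, where the app decides what the gesture means.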
I wasn’t too concerned with the size of the hardware yet. The point was to get something bootstrapped so I could move on to the use cases.
At this point, I had finished CS50 and prototyped some projects with Arduino.
Since iOS was too restrictive, I bought an Android phone and downloaded Android Studio. After doing the Android programming course by Google, and playing around with various projects for around a week, I built an app to interface with the BLE “kit” in two weeks. (Alt+Enter was my best friend.)
But at this point it did not matter. I wanted to get on with testing the use cases, fast.
I spent most of summer 2017 testing use cases in my own day-to-day life. I went through the most used Google Assistant applets, Amazon Alexa skills, Apple Watch use cases, and IFTTT (If This Then That) recipes to see what people use the most. By connecting IFTTT, I was able to use the headset for these use cases:
- Call your phone
- Turn on the TV
- Party time - activate presets
- Bedtime - turn off the lights
- Post to Facebook by voice
- Add new tasks in Todoist
- Add new to-do in iPhone reminders
- Turn off the TV
- Send a text to someone
- Create an event on iPhone calendar
- Add a new contact
- Log notes in a Google Spreadsheet
- Change color of Hues
- Keep a list of notes to email yourself at the end of the day "add to my digest"
- Block the next hour in Calendar
- Set and monitor Nest
- Post a tweet w/voice
- Create a voice note on Evernote
- Set a temperature of Nest thermostat by voice
- Send a note on Slack by voice
- Log weight by voice
- Prioritize connected device w/ voice for WiFi
- Lock a SmartThings lock by voice
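This post doesn't go into how the app talked to IFTTT. One common route is the IFTTT Webhooks service, where an HTTP POST to a trigger URL fires an applet. The sketch below, in Python for brevity, only builds such a request without sending it. The event name and the key are placeholders, and using Webhooks at all is an assumption rather than a description of the actual app.

```python
import json
import urllib.request

# Trigger URL format used by the IFTTT Webhooks service.
IFTTT_URL = "https://maker.ifttt.com/trigger/{event}/with/key/{key}"


def build_ifttt_request(event, key, *values):
    """Build (but do not send) a POST request that fires a Webhooks event.

    Up to three optional values are passed along as value1..value3.
    """
    if len(values) > 3:
        raise ValueError("IFTTT Webhooks accepts at most three values")
    payload = {f"value{i}": v for i, v in enumerate(values, start=1)}
    return urllib.request.Request(
        IFTTT_URL.format(event=event, key=key),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical event name and key; an applet would map the event to Todoist.
request = build_ifttt_request("add_todoist_task", "YOUR_WEBHOOKS_KEY", "buy milk")
# urllib.request.urlopen(request) would actually send it.
```

With this approach, a touch gesture or a voice command on the headset only needs to map to an event name; IFTTT takes care of the rest.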
The conclusion of the testing? Most use cases needed great microphone quality, which the headsets I tested were lacking. It caused many mistakes.
The second problem was that rarely used functions, like “Launch QR reader,” were each bound to a particular touch surface. Because I used them only sporadically, it was easy to forget which surface triggered which one.
Waking up in the morning, putting on an open-ear interface, listening to music, listening to podcasts, and calling seamlessly was so much better than carrying around a pair of earphones or talking to a smartspeaker.
These use cases were already enough. Just imagine when the microphones are perfect and the fit is non-obtrusive.
I knew what the headset was capable of and what I could use it for; now I had to figure out how to shrink the hardware. It was over-sized and uncomfortable.
I moved out of my parents’ house in Slovakia (after working from their basement) to Prague to find people to work with on the project. I took all my savings and decided to pursue the project full-time.
After moving to Prague, I received news that Y Combinator wanted to meet me for an interview in Mountain View.
Hell yeah. I booked tickets for November, and meanwhile started to meet with people in Prague who could help move the project to the next level.
I met with an industrial design student named Darek, who had previously worked on developing drones and designed a bus that won first place in a nation-wide competition. We clicked right away and started to brainstorm design ideas, how to solve most issues, and how to 3D print new prototypes.
I visited and joined a local hackerspace, brmlab, to try to find like-minded people. I ran into an electronics engineer who miniaturized my makeshift Arduino prototype onto one board.
We made custom boards and had them assembled to decrease the size and weight of the BLE kit. I was able to reuse the firmware because we recycled the same components.
At this stage, the next prototype was an intermediate step: a way to further test possible use cases while I got in touch with consumer electronics manufacturers to develop the headset.
The goal was to take current electronics used in other headsets, and innovate in industrial/mechanical design. That would require improving the ergonomics, and improving the functionality by making our own app that could communicate with the headset.
Y Combinator Interview, November 2017
I packed the prototypes, a bunch of electronics just in case the prototype broke, and a couple of tools (including a soldering iron). It was great to see California again.
YC interviews go fast. 10 minutes, and 5 partners asking questions left and right. Questions are sharp and to the point. They were surprised I had learned programming just for the project, which to me felt like a no-brainer.
How could one not dedicate 100% to build what they believe is important? I did forget to mention to them that I had committed my entire savings to the project too.
The partners seemed keen on the potential of adding BCI sensors to the headset and using them as the input (which did not yield satisfactory results at the time, and still does not). I’m all for thinking big, but this was just not possible.
After the interview, back at my Airbnb, I received their rejection email...and then continued working on the app. Jet lag and exhaustion caught up with me eventually.
In the middle of packing the next morning, I couldn’t find my wallet, which held all my travel documents. I nearly ran into a yogi outside my bedroom door, just standing on his head, staring blankly at the wall.
I asked him whether he’d seen my wallet...no answer. I looked everywhere with no luck. When I went back, he was in the lotus position, but hadn’t seen my wallet anywhere. I finally found the wallet bundled in my sheets.
My time there was up. I met the yogi on my way out (again doing a headstand) and headed to the Sacramento area to meet the exchange family I had spent a year with in the US when I was in high school.
Overall, the whole Y Combinator application was a great process that got me to think about all aspects of the project. Useful feedback from the YC partners was the cherry on top.
Explore Sentien Audio. The open-ear headset that merges offline and online realities.
This post was a collaboration between Imrich Valach and Liz Windsor.