What the heck?

Apr 24, 2021

I had a little bit of time these days, so I decided to share something interesting. In my Corona home office I built a small dark and light mode switcher, but hands-free... a little investigating and a few hours later, I had a working result.

So... let's get started with the explanation.

Training the model

First, I trained the model to make sure detecting the sound of clapping hands would work. I used the Teachable Machine platform and started an audio project.

The first step is to record some examples of your background sounds. After that, you can record samples for other classes, for example a clap with your hands.

For example, I also recorded samples of speech so the model would be able to recognise what speech "looks like" and not mistake it for clapping hands.

After you have recorded some example sounds and added them to classes, you can start training the model. Once training is done, you can export the model, ready to use. I decided to host it on Google, because that is the fastest and easiest way to store the trained model.

Building the Chrome extension

I'm not gonna go into too much detail about how to build a Chrome extension because there are a lot of different options, but here's what I did for mine.

You need at least a manifest.json file that will contain details about your extension.

{
  "name": "Darkmode Clapper",
  "description": "Toggle dark mode by clapping your hands!",
  "version": "1.0",
  "manifest_version": 3,
  "permissions": ["storage", "activeTab"],
  "content_scripts": [
    {
      "js": ["content.js"],
      "matches": ["https://app.netlify.com/*"],
      "all_frames": true
    }
  ]
}

The most important parts for this project are the permissions and the content scripts.

Content script

Content scripts are files that run in the context of web pages, so they have access to the pages the browser is currently visiting.

Depending on your config in the manifest.json file, this triggers on any tab or only on specific tabs. As I added the parameter "matches": ["https://app.netlify.com/*"], this only triggers when I'm on the Netlify web app.
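For instance, if you wanted it to run on every page instead, the match pattern could be broadened. A quick sketch using Chrome's special <all_urls> pattern:

"content_scripts": [
  {
    "js": ["content.js"],
    "matches": ["<all_urls>"]
  }
]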

Then, I can start triggering the code dedicated to the sound detection.
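Putting this together, a minimal content.js could start out like the sketch below (setupSoundDetection is a hypothetical wrapper for the TensorFlow.js code covered in the next section):

// content.js - minimal sketch; setupSoundDetection is a hypothetical
// wrapper around the model creation and listening code shown below.
(async () => {
  console.log("Darkmode Clapper loaded on", location.href);
  await setupSoundDetection();
})();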

Setting up TensorFlow.js to detect sounds

When working with TensorFlow.js, I usually export the model as a file on my machine, but this time I decided to use the other option and upload it to Google Cloud. This way, it's accessible via a URL. Alternatively, you can host it on your own web server or in storage like Azure Blob Storage.
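If you use Teachable Machine's built-in upload, the shareable link it generates can serve as the base URL for the code below. A hedged example (the model ID here is made up):

// Hypothetical base URL of the uploaded model. The trailing slash matters,
// because "model.json" and "metadata.json" are appended to it below.
const SPEECH_MODEL_TFHUB_URL =
  "https://teachablemachine.withgoogle.com/models/abc123xyz/";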

Now you can start hacking, and the first step is to create your model:

import "@tensorflow/tfjs"; // peer dependency of the speech-commands package
import * as speechCommands from "@tensorflow-models/speech-commands";

const URL = SPEECH_MODEL_TFHUB_URL; // base URL of your model uploaded on Google Cloud (ends with a slash)
const recognizer = await createModel();

async function createModel() {
  const checkpointURL = URL + "model.json"; // model topology and weights
  const metadataURL = URL + "metadata.json"; // model metadata, including the class labels

  const recognizer = speechCommands.create(
    "BROWSER_FFT", // use the browser's native Fourier transform
    undefined, // speech-commands vocabulary; not needed for a custom model
    checkpointURL,
    metadataURL
  );

  await recognizer.ensureModelLoaded();
  return recognizer;
}

And once it's done, you can start the live prediction:

const classLabels = recognizer.wordLabels(); // An array containing the classes trained. In my case ['Background noise', 'Clap', 'Speech']

recognizer.listen(
  (result) => {
    const scores = result.scores; // will be an array of floating-point numbers between 0 and 1 representing the probability for each class
    const predictionIndex = scores.indexOf(Math.max(...scores)); // get the max value in the array because it represents the highest probability
    const prediction = classLabels[predictionIndex]; // Look for this value in the array of trained classes

    console.log(prediction);
  },
  {
    includeSpectrogram: false, // we only need the scores, not the raw spectrogram
    probabilityThreshold: 0.75, // only fire the callback when the top score is at least 0.75
    invokeCallbackOnNoiseAndUnknown: true, // also fire for the background-noise class
    overlapFactor: 0.5, // consecutive recognition windows overlap by 50%
  }
);

If everything works well, when this code runs it should log "Background noise", "Clap", or "Speech" based on what is predicted from the live audio data.

Now, to toggle Netlify's dark mode, I replaced the console.log statement with a small piece of logic. Dark mode is currently implemented by adding a tw-dark class on the body.

if (prediction === "Clap") {
  if (document.body.classList.contains("tw-dark")) {
    document.body.classList.remove("tw-dark");
    localStorage.setItem("nf-theme", "light");
  } else {
    document.body.classList.add("tw-dark");
    localStorage.setItem("nf-theme", "dark");
  }
}

I also update the value in localStorage so it is persisted.
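Since the theme is stored, the content script could also re-apply it on the next page load. A small sketch, reusing the same nf-theme key and tw-dark class from above:

// Re-apply the persisted theme when the content script loads,
// so the page matches the last hands-free choice.
if (localStorage.getItem("nf-theme") === "dark") {
  document.body.classList.add("tw-dark");
}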

Installing the extension

To be able to test that this code works, you have to install the extension in your browser.

(Before doing so, you might have to bundle your extension, depending on what tools you used.)

To install it, the steps to follow are:

  • Visit chrome://extensions
  • Toggle Developer mode located at the top right of the page
  • Click on Load unpacked and select the folder with your bundled extension
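For reference, the unpacked folder you select would look something like this minimal sketch (the folder name is made up, and your bundler's output may differ):

darkmode-clapper/
├── manifest.json   (the file shown above)
└── content.js      (the bundled content script with the TensorFlow.js logic)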

If all goes well, when you visit whatever page you want to run your extension on, it should ask for microphone permission to be able to detect live audio, and start predicting!

That's it! Overall, this project wasn't even really about toggling dark mode but I've wanted to learn about using TensorFlow.js in a Chrome extension for a while so this seemed like the perfect opportunity! 😃

You can check out the source of this project on my GitHub.