Writing a Fuzzy Search Component With Preact and Fuse for Astro

Lloyd Atkinson
Approximate string matching being used as the basis for a search component

Here is what I’m going to build. I implemented a generic search component and then have specialised search components that compose it - for example, <ArticleSearch />.

String searching can be a complex and involved task. Only the simplest of textual searches can be achieved with merely checking for the presence of a substring. I’ve done a little bit of work with string metrics before using Damerau–Levenshtein distance. Seeing this (or really, any similar string metric algorithm) in action is incredibly satisfying. Sometimes this is called “fuzzy search” or “fuzzy matching”.
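
To make the idea of a string metric concrete, here is a minimal sketch of the optimal string alignment distance, a restricted variant of Damerau–Levenshtein. It is only an illustration, not what Fuse uses internally, and the function name osaDistance is my own.

```typescript
// Optimal string alignment distance: counts insertions, deletions,
// substitutions, and transpositions of two adjacent characters.
function osaDistance(a: string, b: string): number {
  const rows = a.length + 1;
  const cols = b.length + 1;
  // d[i][j] = distance between the first i chars of a and the first j chars of b.
  const d: number[][] = Array.from({ length: rows }, () => new Array<number>(cols).fill(0));

  for (let i = 0; i < rows; i++) d[i][0] = i; // delete every char of a
  for (let j = 0; j < cols; j++) d[0][j] = j; // insert every char of b

  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1,        // deletion
        d[i][j - 1] + 1,        // insertion
        d[i - 1][j - 1] + cost, // substitution (or match)
      );
      // Two adjacent characters swapped, e.g. "ea" vs "ae".
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

osaDistance('search', 'saerch'); // 1: a single transposition
```

A plain substring check treats "saerch" as a complete miss, whereas a metric like this reports it as one edit away from "search".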

I needed to build a search component for my site. It needed to be generic enough that it could be used to search any array of objects passed to it - for any number of pages and features I build as part of the site’s design system. It also needed to implement the string metric-based searching. The previous search feature I implemented for articles was specifically designed to only search articles.

Furthermore, it was only a very simplistic search approach. It did not take into account spelling mistakes or similar words. It was only a simple substring search. When I wrote it I wanted to achieve a minimum working search feature with the goal of refactoring it in the future - which is what this article is about.

The previous implementation, shown below, is OK. It’s fast, very simple, and very easy to understand. However, it does not scale well - it doesn’t search nested data or account for spelling mistakes, and it is not very efficient because it does not build an index of the data. There are solutions for some of this. For example, I could cache the search results based on the search term.

foundTerms () {
  // An empty or too-short search term never matches anything.
  if (!this.search || this.search.length <= minimumLength) return [];
  const term = this.search.toLowerCase().trim();
  return this.searchData.filter(content => {
    return this.keys.some(key => {
      return content[key]
        .trim()
        .toLowerCase()
        .replace(/"/g, '') // a string pattern would only strip the first quote
        .includes(term);
    });
  });
}
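
The caching idea mentioned above could look roughly like the following. This is a sketch rather than code from my site: createCachedSearch and its parameters are hypothetical names, and it assumes every searched property is a string.

```typescript
type Searchable = Record<string, string>;

// Hypothetical helper: wraps the substring search with a cache keyed by
// the normalised search term.
function createCachedSearch<T extends Searchable>(
  data: readonly T[],
  keys: readonly (keyof T)[],
  minimumLength = 2,
) {
  const cache = new Map<string, readonly T[]>();

  return (rawTerm: string): readonly T[] => {
    const term = rawTerm.trim().toLowerCase();
    // Same rule as the original: terms at or below the minimum length match nothing.
    if (term.length <= minimumLength) return [];

    const cached = cache.get(term);
    if (cached) return cached;

    const found = data.filter((item) =>
      keys.some((key) => String(item[key]).toLowerCase().includes(term)),
    );
    cache.set(term, found);
    return found;
  };
}
```

This avoids repeating the filter for a term that has already been seen, but it still does nothing for spelling mistakes or nested data.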

Building an improved search component

With some improvements identified I was excited to build the new component. Like the previous component, this will be entirely client side. It’s faster in this scenario, and as I’m using a static site generator framework, all the required data is available at build time. As this is purely client side, I opted for the popular Fuse library. It’s worth mentioning that this is a pure JavaScript library so it can be used with Node too.

I chose Preact for its small bundle size. This means that there’s going to be very little overhead, especially on pages that don’t otherwise have any JavaScript. By combining a custom hook with the useMemo() hook, I avoid recomputing results when the query has not changed between renders. Additionally, Fuse maintains its own index. The result is a very fast search.

Creating types

Of course, I’m using TypeScript. So naturally I needed to create some types for the component API. I wrote this component with the aim of keeping it reusable without making it only for searching articles. I’d like it to be a reusable component in my site’s design system. With that aim I created the following type:

export type SearchDefinition = {
  readonly primary: string;
  readonly secondary?: readonly string[];
  readonly url?: string;
};

I decided on a few characteristics for how the search component will work.

  • A primary property to index and search on - imagine this being article title, file name, or other generally unique attributes
  • A collection of secondary properties to index and search on - imagine these being description, subtitle, categories, location, and so on
  • Fuse has functionality that allows for weighted and score-based results - while I use the defaults currently, these could be adjusted to align with the primary and secondary properties
  • An optional URL for the location - if this is not set plain text will be rendered instead of a link
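
As a sketch of the weighting idea in the list above, Fuse accepts keys as objects with a name and a weight, alongside the includeScore option. The weights of 2 and 1 below are arbitrary assumptions, and WeightedKey is a local stand-in type so the snippet has no dependencies. In the real component the object would be typed as Fuse.IFuseOptions<SearchDefinition> and passed to the Fuse constructor.

```typescript
type SearchDefinition = {
  readonly primary: string;
  readonly secondary?: readonly string[];
  readonly url?: string;
};

// Local stand-in for the shape of a weighted Fuse key.
type WeightedKey = {
  readonly name: keyof SearchDefinition;
  readonly weight: number;
};

// Assumed weights: matches on the primary property count for more
// than matches on the secondary properties.
const weightedKeys: readonly WeightedKey[] = [
  { name: 'primary', weight: 2 },
  { name: 'secondary', weight: 1 },
];

const weightedOptions = {
  keys: weightedKeys,
  includeScore: true, // ask Fuse to attach a score to each result
};
```

Whether this improves relevance depends on the content being searched, so it would need testing against real queries.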

Creating a custom Preact hook for Fuse

The hook uses useMemo() on both the Fuse instance and the search results. It accepts a single object with three properties: the collection to search, the search term, and the options. Other than setting the primary and secondary keys to search on, I left the rest of the options with their defaults. I experimented with the parameters and thresholds with mixed results. It really depends on what type of content you’re searching for. For example, when I previously used Damerau–Levenshtein distance to match badly formatted, user-entered addresses against official records, I spent a lot of time tweaking the sensitivity and running tests to arrive at relevant results.

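import Fuse from 'fuse.js';
import { useMemo } from 'preact/hooks';

const useFuse = <T,>({
  collection,
  searchTerm,
  options,
}: {
  readonly collection: readonly T[],
  readonly searchTerm: string,
  readonly options: Fuse.IFuseOptions<T>,
}) => {
  // Rebuild the index only when the collection or options change.
  // Both must keep the same reference between renders for this to help.
  const fuse = useMemo(() => {
    return new Fuse(collection, options)
  }, [collection, options]);
  // Re-run the search only when the index or the search term changes.
  const results = useMemo(() => {
    return fuse.search(searchTerm);
  }, [fuse, searchTerm]);
  return results;
};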

The components

The <Search /> component uses the Fuse hook and two further components for displaying the results. I’ve included the relevant code minus the styling, layout, search icon etc. to highlight the relevant parts.

SearchResultList

This is a simple list component that conditionally renders a <span> or an <a> depending on whether the item’s URL property is set.

const SearchResultList = ({ results }: { readonly results: readonly SearchDefinition[] }) => (
  <ul>
    {
      results.map((result) => (
        <li key={ result.primary }>
          {
            result.url
              ? <a href={ result.url } >{ result.primary }</a>
              : <span>{ result.primary }</span>
          }
        </li>
      ))
    }
  </ul>
);

SearchResultCount

This one is self-explanatory. If there are any results, a count is shown; otherwise nothing is rendered. I could have designed this to display a 0 when no results have been found, but I felt that it looked fine as it was.

const SearchResultCount = ({ results }: { readonly results: readonly SearchDefinition[] }) => {
  const hasResults = results.length > 0;
  return hasResults
    ? <div>{results.length} results found</div>
    : null;
};

The <Search /> component uses the hook and the two previous components. I limited the displayed results to 20 because in this use case I feel any further results are probably just noise. The user can further refine their search.

import { useState } from 'preact/hooks';

// Defined once at module scope so the reference stays stable and
// useFuse() does not rebuild its index on every render.
const options: Fuse.IFuseOptions<SearchDefinition> = {
  keys: ['primary', 'secondary'],
};

export const Search = ({
  list,
}: {
  readonly list: readonly SearchDefinition[],
}) => {
  const [collection] = useState<readonly SearchDefinition[]>(list);
  const [searchTerm, setSearchTerm] = useState('');
  // Fuse wraps each match in a result object; unwrap to the original item.
  const filteredList =
    useFuse<SearchDefinition>({ collection, searchTerm, options })
      .map<SearchDefinition>((result) => result.item);
  const filteredListTop = filteredList.slice(0, 20);
  return (
    <>
      <input
        id="search"
        name="search"
        type="search"
        autocomplete="off"
        placeholder="Search by title or description"
        onInput={ (event) => setSearchTerm((event.target as HTMLInputElement).value) } />
      <SearchResultCount results={ filteredList } />
      <SearchResultList results={ filteredListTop } />
    </>
  );
};

ArticleSearch

The existing <Search /> component is a great foundational component for implementing search. The props provided to it are all that’s needed. Due to this, I can now build specialised search components that compose the original one. I’m using Astro as the static site framework for my site. So, I built an Astro component (ArticleSearch.astro) that wraps the existing Preact component. The Astro component loads the articles at build time.

I plan on eventually abstracting the logic you see into its own relevant ES modules. Essentially a series of steps are carried out, mapping and transforming as needed. Note the client:idle Astro directive, which loads and hydrates the component once the page has finished its initial load and the browser is idle, ensuring a smooth page load.

---
import apaStyleCasing from '../../utilities/typography/apa-style-casing';
import { type SearchDefinition, Search } from './Search';
const allArticles =
  (await Astro.glob<Article>('../../pages/posts/**/*.mdx'))
    .map((article) => ({
      ...article.frontmatter,
      url: article.url,
      title: apaStyleCasing(article.frontmatter.title),
      dates: { published: new Date(article.frontmatter.dates.published as unknown as string) }
    }));
const articleTitles = allArticles
  .map<ArticleMeta>((article) => ({
    title: article.title,
    description: article.description,
    categories: article.categories,
    url: article.url,
  }));
// Map each article onto the generic shape the <Search /> component expects.
const articlesInSearchForm = allArticles
  .map<SearchDefinition>((article) => ({
    primary: article.title,
    secondary: article.categories,
    url: article.url,
  }));
---
<Search
  client:idle
  list={ articlesInSearchForm } />

Unit testing

I use the Vitest framework for JavaScript/TypeScript unit testing. Additionally, for these Preact components I used the official unit testing library.

What I’d like to do next

  • Add the ability to highlight the exact result - Fuse already provides the offsets into strings needed for this
  • Build more specific search components as they are needed for any future content I add to my site
  • My design system is currently lacking a dedicated component for text input - once I’ve implemented this any text inputs (including for search component) will use this
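
For the highlighting item in the list above, Fuse can return match ranges as inclusive [start, end] index pairs when the includeMatches option is enabled. The following is a rough sketch of how those ranges could be turned into renderable pieces. toSegments and the Segment type are hypothetical names, and the code is not yet part of the component.

```typescript
// Fuse reports each match range as an inclusive [start, end] pair.
type MatchRange = readonly [number, number];
type Segment = { readonly text: string; readonly highlighted: boolean };

// Splits a string into alternating plain and highlighted segments.
function toSegments(value: string, ranges: readonly MatchRange[]): Segment[] {
  const segments: Segment[] = [];
  let cursor = 0;
  const sorted = [...ranges].sort((a, b) => a[0] - b[0]);

  for (const [start, end] of sorted) {
    if (end < cursor) continue; // range already covered by an earlier one
    const from = Math.max(start, cursor);
    if (from > cursor) {
      segments.push({ text: value.slice(cursor, from), highlighted: false });
    }
    // end is inclusive, so slice one past it.
    segments.push({ text: value.slice(from, end + 1), highlighted: true });
    cursor = end + 1;
  }

  if (cursor < value.length) {
    segments.push({ text: value.slice(cursor), highlighted: false });
  }
  return segments;
}
```

Each highlighted segment could then be rendered inside a <mark> element and the rest as plain text.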

Conclusion

I’m really pleased with this component. It works really well and only took, according to Git, two days to implement and one day to write this post on it. Its ability to handle spelling mistakes and provide fuzzy search results is great. It was an exceptionally smooth experience overall with no major difficulties. Let me know what you think about it!

© Lloyd Atkinson 2024 ✌
