Automated Updates: Keeping Algolia Synced with CSV Data

Automate CSV syncing, index data in Algolia, and display it on a live website

This post explains how to automate CSV syncing, index the data in Algolia, and display it on a live website, while leaving you free to update the CSV data at any time.

Step 1:
Create a CSV file to serve as the data source.

To simplify the process, I am using Google Sheets. Start by creating a new Google Sheet and structuring it with the columns and rows containing the data you want to fetch. Once your sheet is ready, you can publish it as a CSV file to automate data syncing. This sheet will act as the foundation for indexing your data.

The image above illustrates an example of my dummy data.

After creating your Google Sheet, publish it:

  1. Click on File > Share > Publish to web.

  2. Select the specific sheet you want to publish and choose the CSV format. In the example above, I’ve already published Sheet 1 as a CSV file.

  3. Copy the URL and ensure it’s specifically for the CSV data format.
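Before pasting the link into Algolia, it can be sanity-checked in code. The snippet below is a minimal sketch, not an official check: the helper name `isPublishedCsvUrl` is hypothetical, and it assumes that a Google Sheets "Publish to web" link marks the CSV format with an `output=csv` query parameter.

```typescript
// Hypothetical helper: checks that a "Publish to web" link requests CSV output.
// Assumption: Google Sheets marks the CSV format with an `output=csv` query parameter.
function isPublishedCsvUrl(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a valid absolute URL
  }
  return (
    url.protocol === "https:" &&
    url.hostname === "docs.google.com" &&
    url.searchParams.get("output") === "csv"
  );
}
```

If the check fails for your link, reopen the Publish to web dialog and confirm that the CSV format is selected rather than the web page format.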

After generating the URL, head to Algolia to configure the necessary settings.

Step 2:

Click on Data Sources > Connectors, and select CSV from the options.

Once you select CSV, you will see the configuration options, similar to those shown in the image above.

Choose the settings that match your needs.

In the Hosted URL section, paste the URL you copied earlier. Fill in the other necessary fields as required.

For Column Mapping, specify all the column names you want to fetch. If you don't do this, all the data will be stored under a single attribute called "text".
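To know which names to enter in Column Mapping, you can read them straight from the first row of the published CSV. The following is a rough sketch: `csvHeaderColumns` is a hypothetical helper, it only looks at the first line, and it is not a full CSV parser (a quoted header containing a line break would not be handled).

```typescript
// Hypothetical helper: returns the column names from the first row of CSV text.
// Handles double-quoted names (with "" as an escaped quote); not a full CSV parser.
function csvHeaderColumns(csvText: string): string[] {
  const firstLine = csvText.split(/\r?\n/, 1)[0] ?? "";
  const columns: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < firstLine.length; i++) {
    const ch = firstLine.charAt(i);
    if (inQuotes) {
      if (ch === '"' && firstLine.charAt(i + 1) === '"') {
        current += '"'; // escaped quote inside a quoted name
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      columns.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  columns.push(current.trim());
  return columns;
}
```

For a sheet like my dummy data, passing the downloaded CSV text to this function would list names such as `id`, `firstName`, `lastName`, and `country`, which are the values to enter in the mapping.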

After configuring everything, click on Create Source.

You’ll be given an option to make changes to the fetched data, such as removing unnecessary data or applying filters. If you want to make these adjustments, select the option to open the editor and write custom JavaScript for it.

Otherwise, you can skip this step and proceed without making any changes.
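If you do open the editor, the kind of cleanup described above might look like the sketch below. This is only an illustration under assumptions: the function name `transform`, its single-argument shape, and the "return undefined to skip a record" convention are mine, so check the template the editor gives you for the signature it actually expects. It is written in TypeScript for consistency with the rest of this post; since the editor takes JavaScript, the type annotations would need to be removed.

```typescript
// Shape of one CSV row after column mapping (all values arrive as strings).
type CsvRecord = Record<string, string | undefined>;

// Hypothetical transformation: trims values, drops empty fields, and skips
// rows that have no country. Returning undefined here means "skip this record";
// that convention is an assumption, not something confirmed by this post.
function transform(record: CsvRecord): CsvRecord | undefined {
  const cleaned: CsvRecord = {};
  for (const [key, value] of Object.entries(record)) {
    const trimmed = (value ?? "").trim();
    if (trimmed !== "") cleaned[key] = trimmed; // keep only non-empty fields
  }
  return cleaned["country"] ? cleaned : undefined;
}
```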

In the Configure Destination section, select the index name you want to use. If you’ve already created an index, choose that.

If you haven't created an index yet, simply enter the desired index name, and Algolia will automatically create it for you.

Make sure to create the index credentials as well.

Click on Create Destination.

The configuration for automation is shown below.

If you want Algolia to fetch data from your published Google Sheet every hour and update it on your live website, click on Schedule and select 1 hour.

Then, decide how you want to handle the data updates:

  • Replace: This will replace the entire dataset in the index with the newly fetched data.

  • Save: This will only update the specific records* that were fetched, without affecting the rest of the dataset.

  • Partial Update: This will update specific fields within a record, rather than replacing the whole record.

*Once Algolia fetches the data, it stores each record in the index like this:

      {
          "objectID": "9",
          "url": "https://fakestoreapi.com/products/9",
          "country": "Japan",
          "firstName": "Noah",
          "id": "9",
          "lastName": "Walker"
      }

Once you’ve configured everything, click on Create Task and then click on Run to start the process. This will trigger Algolia to fetch the data, index it, and apply the configurations you’ve set, such as scheduling and data handling options.

Now, you can check your index to confirm that the data has been fetched successfully: Algolia dashboard → Search → navigate to the relevant index.
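To make the difference between the three update modes concrete, here is a small in-memory simulation. It is a sketch of the behaviour as described in this post, not Algolia's implementation: the `Index` map and the function names are hypothetical, and creating a record when a partial update targets a missing objectID is my assumption.

```typescript
// A record keyed by objectID; every other field is a plain string.
interface IndexRecord {
  objectID: string;
  [field: string]: string;
}
type Index = Map<string, IndexRecord>;

// Replace: the index ends up containing only the incoming records.
function replaceAll(incoming: IndexRecord[]): Index {
  return new Map<string, IndexRecord>(
    incoming.map((r): [string, IndexRecord] => [r.objectID, r])
  );
}

// Save: each incoming record is added, or overwrites the record with the same objectID.
function save(index: Index, incoming: IndexRecord[]): Index {
  const next: Index = new Map(index);
  for (const r of incoming) next.set(r.objectID, r);
  return next;
}

// Partial update: only the supplied fields change; the other fields are kept.
// Assumption: a missing record is created from the supplied fields.
function partialUpdate(index: Index, incoming: IndexRecord[]): Index {
  const next: Index = new Map(index);
  for (const r of incoming) {
    const existing = next.get(r.objectID);
    const merged: IndexRecord = { ...existing, ...r };
    next.set(r.objectID, merged);
  }
  return next;
}
```

With a record `{ objectID: "9", firstName: "Noah", country: "Japan" }` already in the index, sending `{ objectID: "9", country: "Kenya" }` through `save` leaves a record without `firstName`, while `partialUpdate` keeps `firstName` and only changes `country`.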

Step 3:

To implement the search UI in your code, start by installing the algoliasearch client and the React InstantSearch package, which provides Algolia’s predefined components. Run one of the following commands:

npm install algoliasearch react-instantsearch
# or
yarn add algoliasearch react-instantsearch

Below is an example of how you can search and display the data indexed from the CSV.

import {
  InstantSearch,
  RefinementList,
  Hits,
  SearchBox,
  Pagination,
  ClearRefinements,
  Stats,
  Configure,
} from "react-instantsearch";
import { liteClient as algoliasearch } from "algoliasearch/lite";

const searchClient = algoliasearch(
  process.env.NEXT_PUBLIC_ALGOLIA_APP_ID as string,
  process.env.NEXT_PUBLIC_ALGOLIA_SEARCH_API_KEY as string
);

function Hit({ hit }: any) {
  return (
    <div
      className="border border-gray-300 rounded-lg p-4 mb-4 shadow-sm"
      data-insights-object-id={hit.objectID}
      data-insights-position={hit.__position}
      data-insights-query-id={hit.__queryID}
    >
      <h2 className="text-lg font-bold text-gray-800">
        {hit.firstName} {hit.lastName}
      </h2>
      <p className="text-gray-600">
        <span className="font-semibold">Country:</span> {hit.country}
      </p>
    </div>
  );
}

export default function Home() {
  return (
    <div className="container mx-auto p-6">
      <InstantSearch searchClient={searchClient} indexName="<your-index-name>">
        <SearchBox
          classNames={{
            root: "relative w-full max-w-lg mx-auto",
            input:
              "w-full py-3 px-4 border rounded-lg shadow-sm focus:outline-none",
            submit: "hidden",
          }}
          placeholder="Short, Finley and Greer"
        />

        <div className="grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-4 gap-6">
          <div className="col-span-1">
            <h3 className="text-lg font-semibold text-gray-800 mb-4">
              Filters
            </h3>
            <div className="bg-gray-50 p-4 border border-gray-200 rounded-lg shadow-sm max-h-80 overflow-y-scroll">
              <ClearRefinements
                classNames={{
                  button:
                    "flex justify-center items-center text-gray-700 cursor-pointer py-1 max-w-fit px-4 rounded-lg border border-gray-300",
                  disabledButton: "cursor-not-allowed",
                }}
              />
              {/* This RefinementList is optional; it requires "country" to be configured as a searchable facet in the index. */}
              <RefinementList
                attribute="country"
                classNames={{
                  root: "flex flex-col py-2",
                  searchBox:
                    "max-w-fit border border-gray-300 rounded-3xl py-2 px-4 my-4 ",
                  labelText: "ml-2 text-lg",
                  count: "ml-2",
                  disabledShowMore: "text-gray-200",
                  showMore:
                    "max-w-fit text-gray-700 hover:text-gray-900 cursor-pointer py-1 px-4 rounded-lg border border-gray-300 hover:bg-gray-100",
                }}
                showMore={true}
                searchable
              />
            </div>
          </div>

          <div className="col-span-1 sm:col-span-1 lg:col-span-3">
            <h3 className="text-lg font-semibold text-gray-800 mb-4">
              Results
            </h3>
            <Hits hitComponent={Hit} />
            <Configure
              hitsPerPage={5}
              clickAnalytics={true}
              userToken={"user-1"}
            />
            <Stats />
          </div>
        </div>

        <Pagination
          classNames={{
            root: "flex justify-center space-x-2 mt-6",
            item: "px-3 py-1 text-sm text-gray-700 rounded-lg cursor-pointer",
            selectedItem: "bg-gray-600 text-white",
            disabledItem: "text-gray-400 cursor-not-allowed",
            list: "flex justify-center items-center ",
          }}
        />
      </InstantSearch>
    </div>
  );
}
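One thing to watch in the client setup above: the `as string` casts silence TypeScript but do nothing at runtime, so a missing environment variable only surfaces later as a failed search request. A small guard can fail earlier with a clearer message. This is a sketch, and `requireValue` is a hypothetical helper name. The value is passed in directly, on the assumption that your framework inlines `NEXT_PUBLIC_` variables only when they are referenced by their full name.

```typescript
// Hypothetical guard: returns the value if it is a non-empty string,
// otherwise throws an error that names the missing setting.
function requireValue(value: string | undefined, name: string): string {
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Possible usage in place of the `as string` casts:
// const appId = requireValue(
//   process.env.NEXT_PUBLIC_ALGOLIA_APP_ID,
//   "NEXT_PUBLIC_ALGOLIA_APP_ID"
// );
```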

After implementing the above code, you will see output like the following on your webpage:

Host your application on Vercel or any other hosting platform, or run it locally. Whenever you make changes to the CSV file, Algolia will automatically fetch the updated data every hour, keeping your live website in sync with the latest data.
