{"id":14937,"date":"2025-01-22T14:30:49","date_gmt":"2025-01-22T14:30:49","guid":{"rendered":"https:\/\/temperies.com\/?p=14937"},"modified":"2025-01-22T14:30:50","modified_gmt":"2025-01-22T14:30:50","slug":"find-it-like-magic-smarter-text-searches-with-vectors","status":"publish","type":"post","link":"https:\/\/temperies.com\/es\/2025\/01\/22\/find-it-like-magic-smarter-text-searches-with-vectors\/","title":{"rendered":"Find It Like Magic: Smarter Text Searches with Vectors!"},"content":{"rendered":"<h1>Find It Like Magic: Smarter Text Searches with Vectors!<\/h1>\n\n\n\n<p><em>by Gabriel Vergara<\/em><\/p>\n\n\n\n<h2>Introduction<\/h2>\n\n\n\n<p>When you think about searching for text, you probably imagine SQL queries with <code>LIKE '%keyword%'<\/code> or even complex <strong>regular expressions<\/strong>. But as soon as you deal with real-world data\u2014where typos, synonyms, and phrasing differences exist\u2014those traditional methods start to fall short. That\u2019s where <strong>vector databases<\/strong> come in.<\/p>\n\n\n\n<p>Initially, I discovered vector databases as part of implementing <strong>Retrieval-Augmented Generation (RAG)<\/strong> for AI-driven Q&amp;A systems. But along the way, I realized something interesting: <strong>vector databases are not just for RAG<\/strong>. They are a powerful tool for improving search itself, even without an AI model answering questions.<\/p>\n\n\n\n<p>Unlike SQL or regex-based searches, which rely on exact text matching, vector databases allow <strong>fuzzy, meaning-based searches<\/strong>. This means you can find relevant results even if the search text doesn\u2019t match exactly. 
Imagine searching for &#8220;The Dark Night&#8221; and still getting results for &#8220;The Dark Knight&#8221;\u2014something that would be tricky with traditional methods.<\/p>\n\n\n\n<h3>What We\u2019ll Build<\/h3>\n\n\n\n<p>To see this in action, we\u2019ll build a <strong>similarity search engine<\/strong> using:<\/p>\n\n\n\n<ul><li><strong>LangChain<\/strong> (for easy vector search integration)<\/li><li><strong>Ollama<\/strong> (for easy embedding model handling)<\/li><li><strong>FAISS<\/strong> (a fast, open-source vector database)<\/li><li><strong>Pandas<\/strong> (for handling and analyzing our dataset)<\/li><\/ul>\n\n\n\n<p>We\u2019ll work with an <strong>IMDB movie dataset<\/strong>, allowing us to search for <strong>movies by name or description<\/strong>, even if there are typos or variations in phrasing.<\/p>\n\n\n\n<p>If you want a more in-depth theory explanation, do not hesitate to check out this article: <a href=\"https:\/\/temperies.com\/es\/2025\/01\/16\/from-messy-files-to-magic-answers-how-rag-makes-ai-smarter-and-life-easier\/\" data-type=\"post\" data-id=\"14927\">From Messy Files to Magic Answers: How RAG Makes AI Smarter (and Life Easier)<\/a><\/p>\n\n\n\n<p>By the end, you\u2019ll have a solid understanding of how vector databases work\u2014not just for AI chatbots, but as a powerful standalone search technology. Let\u2019s get started!<\/p>\n\n\n\n<h2>Prerequisites<\/h2>\n\n\n\n<p>Before diving into the examples, ensure that your development environment is set up with the necessary tools and dependencies. Here\u2019s what you\u2019ll need:<\/p>\n\n\n\n<ol><li><strong>Ollama<\/strong>: A local instance of Ollama is required for embedding operations. If you don\u2019t already have Ollama installed, you can download it from <a href=\"https:\/\/ollama.com\/download\">here<\/a>. 
This guide assumes that Ollama is installed and running on your machine.<\/li><li><strong>Models<\/strong>: Once Ollama is set up, pull the required models.<ul><li><strong>all-minilm embedding model<\/strong>: Used to create embeddings for document chunks (<a href=\"https:\/\/ollama.com\/library\/all-minilm\">check here<\/a>).<\/li><\/ul><\/li><li><strong>Python Environment<\/strong>:<ul><li>Python version: This script has been tested with Python 3.10. Ensure you have a compatible Python version installed.<\/li><li>Installing Dependencies: Use a Python environment management tool like <code>pipenv<\/code> to set up the required libraries. Execute the following command in your terminal:<\/li><\/ul><\/li><\/ol>\n\n\n\n<pre class=\"wp-block-code\"><code>pipenv install langchain langchain-community langchain-ollama faiss-cpu pandas<\/code><\/pre>\n\n\n\n<h2>1. Setting Up Our Vector Database<\/h2>\n\n\n\n<p>Before diving into the code, let\u2019s set the stage. We\u2019re working with a <strong>free IMDB dataset<\/strong> extracted from <a href=\"https:\/\/www.kaggle.com\/datasets\/rajugc\/imdb-movies-dataset-based-on-genre\/data\">this Kaggle source<\/a>, specifically the <strong>&#8220;animation&#8221;<\/strong> movie subset in CSV format.<\/p>\n\n\n\n<p>To transform text-based movie data into vector representations, we\u2019ll use <strong>FAISS<\/strong> (a fast vector search library) and <strong>Ollama\u2019s all-minilm embedding model<\/strong> (remember to pull <a href=\"https:\/\/ollama.com\/library\/all-minilm\">the model<\/a> to be able to use it). Since processing large datasets at once can be demanding, we\u2019ll <strong>convert data in batches<\/strong> for better performance and efficiency.<\/p>\n\n\n\n<p>The <strong>full script will be available at the end<\/strong> of this article, but let\u2019s go step by step and dissect the key parts.<\/p>\n\n\n\n<h3>1.1. 
Configuration and Setup<\/h3>\n\n\n\n<p>The script starts by defining important parameters:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>embedding_model = 'all-minilm'  # can also try with 'nomic-embed-text'\nmovie_csv_file = \".\/movie_dataset\/animation.csv\"\nvectorstore_path = f'vectorstore_{embedding_model}'\nbatch_size = 32<\/code><\/pre>\n\n\n\n<ul><li>We specify the <strong>embedding model<\/strong> (<code>all-minilm<\/code>) to convert text into vectors.<\/li><li>The script reads data from the <strong>animation movie dataset<\/strong> (<code>animation.csv<\/code>).<\/li><li>We define the <strong>vectorstore directory<\/strong> (<code>vectorstore_all-minilm<\/code>), where our FAISS index will be saved.<\/li><li>The script processes <strong>32 movies at a time<\/strong> to <strong>reduce memory usage<\/strong>.<\/li><\/ul>\n\n\n\n<h3>1.2. Removing Old Vector Stores<\/h3>\n\n\n\n<p>Before creating a new vector database, we check whether one already exists and delete it to start fresh:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>if os.path.isdir(vectorstore_path):\n    shutil.rmtree(vectorstore_path)<\/code><\/pre>\n\n\n\n<p>This ensures that we always work with a <strong>clean FAISS index<\/strong> instead of appending to outdated data.<\/p>\n\n\n\n<h3>1.3. Loading and Validating the Dataset<\/h3>\n\n\n\n<p>Next, we load the CSV file into a Pandas DataFrame and <strong>ensure it contains the necessary columns<\/strong> (<code>movie_name<\/code> and <code>description<\/code>):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>df = pd.read_csv(movie_csv_file)\nif \"movie_name\" not in df.columns or \"description\" not in df.columns:\n    raise ValueError(\"CSV file must contain 'movie_name' and 'description' columns.\")<\/code><\/pre>\n\n\n\n<p>If these columns are missing, the script will raise an error. This step is important because <strong>vectorization relies on movie names and descriptions<\/strong> to generate embeddings.<\/p>\n\n\n\n<h3>1.4. 
Initializing FAISS and the Embedding Model<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>vectorstore_embeddings = OllamaEmbeddings(model=embedding_model)\nvectorstore_index = None  # Placeholder for FAISS index<\/code><\/pre>\n\n\n\n<ul><li>We use <strong>Ollama\u2019s embedding model<\/strong> to generate text embeddings.<\/li><li>The <strong>FAISS index<\/strong> starts as <code>None<\/code> because we\u2019ll create it dynamically while processing batches of movies.<\/li><\/ul>\n\n\n\n<h3>1.5. Batch Processing for Efficient Indexing<\/h3>\n\n\n\n<p>To handle large datasets efficiently, we <strong>process movies in batches<\/strong> rather than all at once:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>for i in tqdm(range(0, len(df), batch_size), desc=\"Indexing Progress\"):\n    batch_df = df.iloc&#91;i:i + batch_size]\n    documents = &#91;\n        Document(\n            page_content=f\"Title: {row&#91;'movie_name']} | Description: {row&#91;'description']}\",\n            metadata={\"movie_name\": row&#91;\"movie_name\"], \"description\": row&#91;\"description\"]}\n        )\n        for _, row in batch_df.iterrows()\n    ]<\/code><\/pre>\n\n\n\n<ul><li>We iterate over the dataset in chunks of <strong>32 movies per batch<\/strong>.<\/li><li>Each movie is converted into a <strong>LangChain Document<\/strong>, combining the <strong>title and description<\/strong> into a single searchable text entry.<\/li><li>Metadata (original movie name and description) is stored alongside the text for reference.<\/li><\/ul>\n\n\n\n<h3>1.6. 
Creating and Updating the FAISS Index<\/h3>\n\n\n\n<p>Once we have a batch of movie documents, we either <strong>create a new FAISS index<\/strong> or <strong>add to an existing one<\/strong>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>if vectorstore_index is None:\n    vectorstore_index = FAISS.from_documents(documents, vectorstore_embeddings)\nelse:\n    vectorstore_index.add_documents(documents)<\/code><\/pre>\n\n\n\n<ul><li>If this is the <strong>first batch<\/strong>, we initialize the FAISS index.<\/li><li>Otherwise, we <strong>incrementally add<\/strong> new documents to the existing index.<\/li><li>This structure makes it easy to handle datasets of <strong>any size<\/strong> without overloading memory.<\/li><\/ul>\n\n\n\n<h3>1.7. Saving the Vector Database<\/h3>\n\n\n\n<p>After processing all batches, we <strong>save the FAISS index<\/strong> to disk for future use:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>vectorstore_index.save_local(vectorstore_path)<\/code><\/pre>\n\n\n\n<p>This allows us to <strong>reuse the indexed data later<\/strong> without needing to reprocess the entire dataset.<\/p>\n\n\n\n<h3>Wrapping Up<\/h3>\n\n\n\n<p>At this point, we have successfully:<\/p>\n\n\n\n<ul><li>Loaded an <strong>IMDB animation movie dataset<\/strong><\/li><li>Converted movie names and descriptions into <strong>vector embeddings<\/strong><\/li><li>Indexed the data in <strong>FAISS<\/strong>, a fast and scalable vector database<\/li><li>Saved the index for later searches<\/li><\/ul>\n\n\n\n<p>In the next section, we\u2019ll use this <strong>vectorized database<\/strong> to perform <strong>fast, fuzzy movie searches<\/strong>\u2014finding relevant results even when queries contain typos or phrasing variations.<\/p>\n\n\n\n<h2>2. Searching for Movies in the Vector Database<\/h2>\n\n\n\n<p>Now that we have built and stored our <strong>vectorized movie database<\/strong>, it\u2019s time to put it to work! 
In this section, we\u2019ll explore how to search for movies using FAISS and LangChain.<\/p>\n\n\n\n<p>This script allows you to enter <strong>a movie title or description<\/strong>, and it will return the <strong>most similar results<\/strong> from our FAISS index. The key concept here is <strong>vector similarity<\/strong>\u2014the closer a movie\u2019s embedding is to the search query, the better the match.<\/p>\n\n\n\n<p>As always, <strong>the full script will be available at the end<\/strong> of this article, but let\u2019s go through its most important parts.<\/p>\n\n\n\n<h3>2.1. Defining the Search Function<\/h3>\n\n\n\n<p>At the heart of this script is the <strong><code>search_movies<\/code><\/strong> function, which retrieves similar movies based on a given query:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>def search_movies(query, top_k=5):\n    results = vectorstore.similarity_search_with_score(query, k=top_k)\n    return &#91;\n        (doc.metadata&#91;\"movie_name\"], doc.metadata&#91;\"description\"], score)\n        for doc, score in results\n    ]<\/code><\/pre>\n\n\n\n<h4>How does it work?<\/h4>\n\n\n\n<ul><li>The <strong>query<\/strong> (movie name or description) is passed into FAISS.<\/li><li>FAISS performs a <strong>similarity search<\/strong> and returns the <strong>top K matches<\/strong> (default: 5).<\/li><li>Each result includes:<ul><li>The <strong>movie title<\/strong><\/li><li>The <strong>movie description<\/strong><\/li><li>A <strong>similarity score<\/strong> (lower scores mean better matches)<\/li><\/ul><\/li><\/ul>\n\n\n\n<h4>What is the similarity score?<\/h4>\n\n\n\n<p>The <strong><code>similarity_search_with_score<\/code><\/strong> function measures how close each movie\u2019s vector is to the query.<\/p>\n\n\n\n<ul><li>A score <strong>closer to 0<\/strong> means a <strong>better match<\/strong>.<\/li><li>Higher scores indicate <strong>less relevant results<\/strong>.<\/li><\/ul>\n\n\n\n<p>For example, if you search for 
<code>\"Finding Nimo\"<\/code>, it might still return <code>\"Finding Nemo\"<\/code> because the vectors are similar, even though the text isn\u2019t an exact match.<\/p>\n\n\n\n<h3>2.2. Loading the Vector Database<\/h3>\n\n\n\n<p>Before searching, we need to <strong>load our FAISS vector store<\/strong> and its embeddings:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>embedding_model = 'all-minilm'  # can also try with 'nomic-embed-text'\nvectorstore_path = f'vectorstore_{embedding_model}'\nvectorstore_embeddings = OllamaEmbeddings(model=embedding_model)\nvectorstore = FAISS.load_local(vectorstore_path, vectorstore_embeddings, allow_dangerous_deserialization=True)<\/code><\/pre>\n\n\n\n<ul><li>We set up the <strong>same embedding model<\/strong> (<code>all-minilm<\/code>) that we used for indexing.<\/li><li>We define the <strong>path to our stored FAISS vector database<\/strong>.<\/li><li>We <strong>reload the vector index<\/strong> using the embeddings so that we can perform searches.<\/li><\/ul>\n\n\n\n<p>The <strong><code>allow_dangerous_deserialization=True<\/code><\/strong> flag is required when loading a local FAISS index, because part of the index is stored with Python\u2019s <code>pickle<\/code> module. Since deserializing a pickle file can execute arbitrary code, only enable this flag for index files that you created yourself and trust.<\/p>\n\n\n\n<h3>2.3. Interactive Search Loop<\/h3>\n\n\n\n<p>Now, we enter a <strong>loop<\/strong> that allows the user to keep searching for different movies until they decide to exit:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>search_another = True\nwhile search_another:\n    print('-' * 80)\n    movie_search_query = input(\"Enter a movie\/description to search for: \")\n    movie_results = search_movies(movie_search_query)\n\n    print(\"\\n---- Similar Movies --------------------\\n\")\n    for name, description, score in movie_results:\n        print(f\"- Title: {name}\\n\\t- Score: {score:.4f}\\n\\t- Description: {description}\\n\")\n\n    search_another = input(\"Search again? 
&#91;Y,n]: \").lower() in &#91;'y', '']<\/code><\/pre>\n\n\n\n<ol><li>The script <strong>asks for a search query<\/strong> (a movie title or a description).<\/li><li>It <strong>retrieves the top matches<\/strong> using <code>search_movies(query)<\/code>.<\/li><li>Results are displayed in a readable format:<ol><li><strong>Movie Title<\/strong><\/li><li><strong>Similarity Score<\/strong> (lower is better)<\/li><li><strong>Movie Description<\/strong><\/li><\/ol><\/li><li>The user is asked if they want to <strong>search again<\/strong> or exit the program.<\/li><\/ol>\n\n\n\n<h3>Example Search Output<\/h3>\n\n\n\n<p>As a real-world case, there was a movie I wanted to recommend to a friend, but I could not remember its name. It was a comedy about a girl on a family trip that is interrupted by an AI rebellion. Using that idea as a search query, let\u2019s say we search for <strong>&#8220;family trip ai rebellion&#8221;<\/strong>. The output might look something like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>----------------------------------------------------------------------\nEnter a movie\/description to search for: family trip ai rebellion\n\n---- Similar Movies --------------------\n\n- Title: The Hoovers Conquer the Universe\n    - Score: 1.0515\n    - Description: An animated family space adventure\n\n- Title: Untitled Animated Feature Film\n    - Score: 1.1748\n    - Description: A progressive sociopolitical family film.\n\n- Title: Robot Zot\n    - Score: 1.1781\n    - Description: The story of a wayward scout for an invading alien force, whose course goes hopelessly awry when he lands in the yard of a modern-day, suburban family with problems of their own. 
Based on the book by Jon Scieszka and David Shannon.\n\n- Title: The Mitchells vs the Machines\n    - Score: 1.1836\n    - Description: A quirky, dysfunctional family's road trip is upended when they find themselves in the middle of the robot apocalypse and suddenly become humanity's unlikeliest last hope.<\/code><\/pre>\n\n\n\n<p>The movie was <strong>&#8220;The Mitchells vs the Machines&#8221;<\/strong>, and it appeared in the results. This demonstrates how <strong>vector-based search handles variations<\/strong> much better than traditional SQL or regex searches!<\/p>\n\n\n\n<h3>Wrapping Up<\/h3>\n\n\n\n<p>At this point, we have successfully:<\/p>\n\n\n\n<ul><li>Loaded the <strong>vectorized movie database<\/strong><\/li><li>Implemented <strong>a similarity search function<\/strong><\/li><li>Created an <strong>interactive search loop<\/strong><\/li><li>Returned <strong>relevant movie recommendations based on text similarity<\/strong><\/li><\/ul>\n\n\n\n<p>In the next section, we\u2019ll wrap everything up and provide the <strong>full code<\/strong> so you can try it yourself!<\/p>\n\n\n\n<h2>Full code<\/h2>\n\n\n\n<p>This is the full code of both the vector store feeder as well as the vector store queries.<\/p>\n\n\n\n<h3>index_movie_description.py<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>import os\nimport shutil\nimport pandas as pd\nfrom tqdm import tqdm\n\nfrom langchain_community.vectorstores import FAISS\nfrom langchain_ollama.embeddings import OllamaEmbeddings\nfrom langchain.schema import Document\n\nif __name__ == '__main__':\n    # ---- Configuration --------------------------------------------\n    embedding_model = 'all-minilm' # can also try with 'nomic-embed-text'\n    movie_csv_file = \".\/movie_dataset\/animation.csv\"\n    vectorstore_path = f'vectorstore_{embedding_model}'\n    batch_size = 32\n\n    # ---- Vector Store setup ---------------------------------------\n    print('-' * 80)\n    print(f'&gt; Creating vectorstore: 
{vectorstore_path} ')\n    # Remove previous vectorstore\n    if os.path.isdir(vectorstore_path):\n        print(f'  &gt; Removing existing vectorstore... ')\n        shutil.rmtree(vectorstore_path)\n\n    # ---- Grab movie .csv file for the vector store\n    print(f'  &gt; Processing movies from: {movie_csv_file} ')\n    # Ensure column names match expected format\n    df = pd.read_csv(movie_csv_file)\n    if \"movie_name\" not in df.columns or \"description\" not in df.columns:\n        raise ValueError(\"CSV file must contain 'movie_name' and 'description' columns.\")\n\n    # Initialize FAISS index &amp; embedding model\n    vectorstore_embeddings = OllamaEmbeddings(model=embedding_model)\n    vectorstore_index = None  # Placeholder for FAISS index\n\n    # ---- Batch Processing ----\n    print(f'  &gt; Indexing {len(df)} movies in batches of {batch_size}...\\n')\n    for i in tqdm(range(0, len(df), batch_size), desc=\"Indexing Progress\"):\n        # Extract batch &amp; convert movie names into Langchain Documents (concatenate movie name and description)\n        batch_df = df.iloc&#91;i:i + batch_size]\n        documents = &#91;\n            Document(\n                page_content=f\"Title: {row&#91;'movie_name']} | Description: {row&#91;'description']}\",\n                metadata={\"movie_name\": row&#91;\"movie_name\"], \"description\": row&#91;\"description\"]}\n            )\n            for _, row in batch_df.iterrows()\n        ]\n        # Initialize or append to FAISS index\n        if vectorstore_index is None:\n            vectorstore_index = FAISS.from_documents(documents, vectorstore_embeddings)\n        else:\n            vectorstore_index.add_documents(documents)\n\n    # ---- Save vector store index\n    print(f'  &gt; Saving vectorstore: {vectorstore_path}')\n    vectorstore_index.save_local(vectorstore_path)\n    print('&gt; Vectorstore created!')<\/code><\/pre>\n\n\n\n<h3>search_movie_description.py<\/h3>\n\n\n\n<pre 
class=\"wp-block-code\"><code>from langchain_ollama.embeddings import OllamaEmbeddings\nfrom langchain_community.vectorstores import FAISS\n\n# Query function\ndef search_movies(query, top_k=5):\n    results = vectorstore.similarity_search_with_score(query, k=top_k)\n    return &#91;\n        (doc.metadata&#91;\"movie_name\"], doc.metadata&#91;\"description\"], score)\n        for doc, score in results\n    ]\n\nif __name__ == '__main__':\n    # ---- Configuration --------------------------------------------\n    embedding_model = 'all-minilm' # can also try with 'nomic-embed-text'\n    vectorstore_path = f'vectorstore_{embedding_model}'\n\n    # ---- Load FAISS index -----------------------------------------\n    vectorstore_embeddings = OllamaEmbeddings(model=embedding_model)\n    vectorstore = FAISS.load_local(vectorstore_path, vectorstore_embeddings, allow_dangerous_deserialization=True)\n\n    # ---- Query vector store ---------------------------------------\n    search_another = True\n    while search_another:\n        print('-' * 80)\n        movie_search_query = input(\"Enter a movie\/description to search for: \")\n        movie_results = search_movies(movie_search_query)\n\n        print(\"\\n---- Similar Movies --------------------\\n\")\n        for name, description, score in movie_results:\n            print(f\"- Title: {name}\\n\\t- Score: {score:.4f}\\n\\t- Description: {description}\\n\")\n\n        search_another = input(\"Search again? &#91;Y,n]: \").lower() in &#91;'y', '']<\/code><\/pre>\n\n\n\n<h2>Conclusion: Smarter Searches, Zero Hassle<\/h2>\n\n\n\n<p>We\u2019ve explored how <strong>vector databases<\/strong> can <strong>revolutionize text-based search<\/strong>, making it smarter, more flexible, and typo-resistant. 
Instead of relying on old-school SQL <code>LIKE<\/code> queries or complex regex patterns, we leveraged <strong>FAISS, LangChain, and Ollama embeddings<\/strong> to perform <strong>meaning-based searches<\/strong> on an IMDB movie dataset.<\/p>\n\n\n\n<h3>Beyond Movies: What\u2019s Next?<\/h3>\n\n\n\n<p>This approach isn\u2019t just for movies\u2014it can be applied to <strong>product searches, document retrieval, customer support chatbots<\/strong>, and any scenario where traditional search struggles with variations in wording.<\/p>\n\n\n\n<p>With <strong>vector search<\/strong>, you can build <strong>Google-like search experiences<\/strong> without needing a full AI-powered chatbot. And the best part? You don\u2019t need <strong>huge infrastructure or deep learning expertise<\/strong>\u2014just smart indexing and a good embedding model.<\/p>\n\n\n\n<p>Want to take it further? Try experimenting with <strong>different embedding models, datasets, or even multimodal (text + images) searches<\/strong>. <\/p>\n\n\n\n<p>Now, it\u2019s your turn\u2014give it a try and let me know what you build!<\/p>\n\n\n\n<h2>About Me<\/h2>\n\n\n\n<p><em>I\u2019m Gabriel, and I like computers. A lot.<\/em><\/p>\n\n\n\n<p>For nearly 30 years, I\u2019ve explored the many facets of technology\u2014as a developer, researcher, sysadmin, security advisor, and now an AI enthusiast. Along the way, I\u2019ve tackled challenges, broken a few things (and fixed them!), and discovered the joy of turning ideas into solutions. My journey has always been guided by curiosity, a love of learning, and a passion for solving problems in creative ways.<\/p>\n\n\n\n<p>See ya around!<\/p>","protected":false},"excerpt":{"rendered":"<p>Unlike SQL or regex-based searches, which rely on exact text matching, vector databases allow fuzzy, meaning-based searches. This means you can find relevant results even if the search text doesn\u2019t match exactly. 
Imagine searching for &#8220;The Dark Night&#8221; and still getting results for &#8220;The Dark Knight&#8221;\u2014something that would be tricky with traditional methods.<\/p>","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[54],"tags":[55,61,68,56,69,59,60],"_links":{"self":[{"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/posts\/14937"}],"collection":[{"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/comments?post=14937"}],"version-history":[{"count":6,"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/posts\/14937\/revisions"}],"predecessor-version":[{"id":14943,"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/posts\/14937\/revisions\/14943"}],"wp:attachment":[{"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/media?parent=14937"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/categories?post=14937"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/temperies.com\/es\/wp-json\/wp\/v2\/tags?post=14937"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}