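In the era of purpose-built AI databases, how does a traditional database like MariaDB reinvent itself to stay relevant? Find out in this article.
As a solutions architect with more than two decades of experience in relational database systems, I recently started exploring MariaDB's new vector edition to see whether it can address some of the AI data challenges we face. A quick look seemed quite convincing, especially how it brings AI magic directly into a regular database setup. However, I wanted to test it with a simple use case to see how it performs in practice.
In this article, I will share my hands-on experience and observations of MariaDB's vector capabilities by running a simple use case. Specifically, I will load sample customer reviews into MariaDB and perform fast similarity searches to find related reviews.
My experiment started by setting up a Docker container using the latest MariaDB release (11.6) that includes vector capabilities.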
# Pull the latest release
docker pull quay.io/mariadb-foundation/mariadb-devel:11.6-vector-preview
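# Start the container: replace the password placeholder, and publish port 3306 so a client on the host can connect
docker run -d --name mariadb_vector -p 3306:3306 -e MYSQL_ROOT_PASSWORD=<replace_password> quay.io/mariadb-foundation/mariadb-devel:11.6-vector-preview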
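Now, create a table and load sample customer reviews, including a sentiment score and an embedding for each review. To generate the text embeddings, I use SentenceTransformer, which lets you use pre-trained models. Specifically, I decided to use a model called paraphrase-MiniLM-L6-v2, which takes our customer reviews and maps them into a 384-dimensional space.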
import mysql.connector
from sentence_transformers import SentenceTransformer
# Pre-trained model that maps text to 384-dimensional embeddings
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')
# I already have a database created with the name vectordb
connection = mysql.connector.connect(
    host="localhost",
    user="root",
    password="<password>",  # Replace me
    database="vectordb"
)
cursor = connection.cursor()
# Create a table to store customer reviews with sentiment score and embeddings.
# Column names match the INSERT and SELECT statements used in this article.
# Note: the vector index is declared with VECTOR INDEX on a NOT NULL column, and
# no ENGINE=ColumnStore clause is used because ColumnStore does not support indexes.
# The exact syntax may differ between the 11.6 preview and later releases.
cursor.execute("""
    CREATE TABLE IF NOT EXISTS customer_reviews (
        id INT PRIMARY KEY AUTO_INCREMENT,
        product_id INT,
        review_text TEXT,
        sentiment_score FLOAT,
        review_embedding BLOB NOT NULL,
        VECTOR INDEX (review_embedding)
    )
""")
# Sample reviews: (product_id, review_text, sentiment_score)
reviews = [
    (1, "This product exceeded my expectations. Highly recommended!", 0.9),
    (1, "Decent quality, but pricey.", 0.6),
    (2, "Terrible experience. The product does not work.", 0.1),
    (2, "Average product, ok ok", 0.5),
    (3, "Absolutely love it! Best purchase I have made this year.", 1.0)
]
# Load sample reviews into the vector DB
for product_id, review_text, sentiment_score in reviews:
    # encode() returns a float32 NumPy array; store its raw bytes in the BLOB column
    embedding = model.encode(review_text)
    cursor.execute(
        "INSERT INTO customer_reviews (product_id, review_text, sentiment_score, review_embedding) VALUES (%s, %s, %s, %s)",
        (product_id, review_text, sentiment_score, embedding.tobytes()))
connection.commit()
connection.close()
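The loading script above passes each embedding to the database as raw bytes via `tobytes()`. As a minimal sketch of what that serialization amounts to, the snippet below packs and unpacks a vector as little-endian 32-bit floats using only the standard library. It assumes the default float32 output of the model on a little-endian machine; the helper names `pack_embedding` and `unpack_embedding` and the constant-valued vector are hypothetical stand-ins, not part of MariaDB or SentenceTransformer.

```python
import struct

EMBEDDING_DIM = 384  # dimensionality of paraphrase-MiniLM-L6-v2 embeddings


def pack_embedding(values):
    """Serialize floats as little-endian 32-bit floats (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_embedding(blob):
    """Decode a blob produced by pack_embedding back into a list of floats."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


# Hypothetical stand-in for a real embedding; 0.5 is exactly representable in float32.
fake_embedding = [0.5] * EMBEDDING_DIM
blob = pack_embedding(fake_embedding)
print(len(blob))  # 1536 bytes = 384 values * 4 bytes
restored = unpack_embedding(blob)
```

A helper like `unpack_embedding` can be handy for inspecting an embedding read back from the BLOB column, since the database returns it as opaque bytes.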
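Now, let's use MariaDB's vector capabilities to find similar reviews. This is like asking, "Have other customers said something similar?" In the example below, I find the top 2 customer reviews most similar to "I am super satisfied!". To do this, I use one of the vector functions (VEC_Distance_Euclidean) available in the latest release.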
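# Reconnect, because the loading script closed its connection (reuses `model` from above)
connection = mysql.connector.connect(
    host="localhost", user="root", password="<password>", database="vectordb"
)
cursor = connection.cursor()
# Convert the target customer review into a vector
target_review_embedding = model.encode("I am super satisfied!")
# Find the top 2 similar reviews using MariaDB's VEC_Distance_Euclidean function.
# The value is a distance, so smaller means more similar; ascending order puts the best matches first.
cursor.execute("""
    SELECT review_text, sentiment_score,
           VEC_Distance_Euclidean(review_embedding, %s) AS distance
    FROM customer_reviews
    ORDER BY distance
    LIMIT %s
""", (target_review_embedding.tobytes(), 2))
similar_reviews = cursor.fetchall()
connection.close()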
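To make the ranking in the query above concrete, here is a rough sketch of the same idea in plain Python: compute the Euclidean distance from the query vector to every stored vector, sort ascending, and keep the first k. This is a brute-force illustration only and says nothing about how MariaDB evaluates the query internally. The function `top_k_by_euclidean` and the 3-dimensional vectors are hypothetical, chosen so the arithmetic is easy to follow.

```python
import math


def top_k_by_euclidean(query, rows, k):
    """Return the k (distance, label) pairs closest to query, nearest first."""
    scored = [(math.dist(query, vector), label) for label, vector in rows]
    return sorted(scored)[:k]


# Hypothetical 3-dimensional vectors standing in for 384-dimensional embeddings.
rows = [
    ("positive review", [0.9, 0.1, 0.0]),
    ("neutral review", [0.5, 0.5, 0.0]),
    ("negative review", [0.0, 0.1, 0.9]),
]
query = [1.0, 0.0, 0.0]

for distance, label in top_k_by_euclidean(query, rows, 2):
    print(f"{label}: {distance:.3f}")
```

With these toy vectors, the vector pointing in nearly the same direction as the query ranks first, which mirrors how the review closest in meaning to the target sentence would come back first from the SQL query.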
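Overall, I am impressed! MariaDB's vector edition will simplify certain AI-driven architectures. It bridges the gap between the traditional database world and the evolving demands of AI tools. In the coming months, I look forward to seeing how this technology matures and how the community adopts it in real-world applications.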