Please note that this post is obsolete.
InterSystems Official
· Jun 14, 2023

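June 13, 2023 - Advisory: Increased Process Memory Usage

InterSystems has corrected a defect that causes increased process memory usage in InterSystems IRIS products.

Affected versions:
  InterSystems IRIS                      2022.2, 2022.3, 2023.1.0
  InterSystems IRIS for Health   2022.2, 2022.3, 2023.1.0
  HealthShare Health Connect   2022.2, 2022.3, 2023.1.0
  Healthcare Action Engine         2022.1

Affected platforms: All

Problem details:
When $Order, $Query, or Merge is executed on a local variable, the memory consumed by the process's local variable table increases. This has no adverse effect in most execution environments, but environments with a large number of processes, or environments that strictly limit the maximum memory per process, may be affected. In addition, some processes may encounter <STORE> errors.

Resolution:
This problem is resolved by fix IDs DP-423127 and DP-423237.
These fixes will be included in all future versions.

In addition, the previously published InterSystems IRIS 2023.1.0.229.0 has been updated to InterSystems IRIS 2023.1.0.235.1, which includes this fix.

On request, the fix can be built and provided individually as a patch for the EM release you are currently using. If you need a patch for your system, please check your version information and license key information and contact the InterSystems Customer Support Center. If you have any questions about this advisory, please contact the InterSystems Customer Support Center.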

Article
· Jun 14, 2023 · 2 min read

LangChain Ghost in the PDF

Posing a question to consider during the current Grand Prix competition.

I wanted to share an observation about using PDFs with LangChain.

When loading the text out of a PDF, I noticed there was an artifact of gaps within some of the words extracted.

For example (highlighted in red)

Adapti ve Analytics is an optional e xtension that pro vides a b usiness-oriented, virtual data model layer\nbetween InterSystems IRIS and popular Business Intelligence (BI) and Artificial Intelligence (AI) client tools. It includes\nan intuiti ve user interf ace for de veloping a data model in the form of virtual cubes  where data can be or ganized, calculated\nmeasures consistently defined, and data fields clearly named. By ha ving a centralized common data model, enterprises\nsolve the problem of dif fering definitions and calculations to pro vide their end users with one consistent vie w of b usiness\nmetrics and data characterization.

I was concerned this would affect:
1) The quality of document search for related content
2) The ability of the OpenAI model to generate answers

What might be needed to stitch these words back together to improve things?

Could this use a word dictionary?

What would be the risk of linking two separate words together?
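To make the dictionary idea concrete, here is a minimal sketch using only the Python standard library. The function name `stitch_fragments` and the tiny vocabulary are hypothetical, made up for illustration; a real word list would be far larger. It joins two neighbouring fragments only when the joined form is a known word and at least one fragment is not, which is one possible way to limit the risk of gluing two genuine words together.

```python
def stitch_fragments(text, vocabulary):
    """Rejoin adjacent fragments when the joined form is a known word.

    A pair is merged only if the joined word is in the vocabulary and at
    least one of the two fragments is NOT a known word on its own.
    Only two-fragment splits are handled, and punctuation is not stripped.
    """
    words = text.split()
    result = []
    i = 0
    while i < len(words):
        if i + 1 < len(words):
            left, right = words[i], words[i + 1]
            joined = left + right
            both_known = left.lower() in vocabulary and right.lower() in vocabulary
            if joined.lower() in vocabulary and not both_known:
                result.append(joined)
                i += 2  # consume both fragments
                continue
        result.append(words[i])
        i += 1
    return " ".join(result)


# Hypothetical mini vocabulary, just enough for the sample sentence
VOCAB = {"adaptive", "analytics", "is", "an", "optional",
         "extension", "that", "provides", "a"}

sample = "Adapti ve Analytics is an optional e xtension that pro vides a"
print(stitch_fragments(sample, VOCAB))
```

The `both_known` guard is the part that addresses the linking risk: a pair such as "a part" stays untouched even when "apart" is also in the vocabulary, at the cost of missing splits where both halves happen to be real words.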

Pushing ahead, the unanticipated outcome was:

  • It didn't make a difference to either the document search or the ability to generate answers.

I suspect this is down to the way that OpenAI encoding and tokenizing operate.
The number of tokens is typically higher than the number of words.
So tokens are already like "partial" words, where tokens follow one another.
Thus the spaces in the middle of words didn't affect the answer.

Please share your experiences of Ghosts / Curious effects when using LangChain with IRIS.

Article
· Jun 13, 2023 · 2 min read

OEX mapping #2

Technology Strategy

When I started this project, I set myself limits.
Although there is a wide range of almost ready-to-use modules in various languages,
and although IRIS has excellent facilities and interfaces to make use of them,
I decided to solve the challenge "totally internal", using just embedded Python, SQL, and ObjectScript.
No Java, Node.js, Angular, PEX, ... you name it.
The combination of embedded Python and SQL is preferred. ObjectScript is just my last resort.

I was especially impressed by how easy it was to read an HTTPS page with Python.
On the other hand, I left the Unit Test, Global Merge, and Object Property Setter in ObjectScript (COS).
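The project's own page-reading code is not shown here, so the following is only a sketch of how little is needed to read an HTTPS page with the Python standard library. The helper name `read_page` and the example URL are assumptions for illustration, not taken from the package.

```python
from urllib.request import urlopen


def read_page(url: str, timeout: float = 30.0) -> str:
    """Fetch a URL and return the response body decoded as text."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset announced by the server, falling back to UTF-8
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example call (needs network access; the URL is a placeholder):
# html = read_page("https://example.com/")
# print(len(html))
```

TLS certificate checking is handled by `urlopen` with its default settings, which is a large part of why this stays so short.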

Add on after 1st release

It was rather shocking that the initial load took about 50 minutes to end up with 730 records.
So a kind of QUICK preload was added. In practice, only the first page of the directory, and possibly the 2nd page during a contest,
holds new entries. The rest is almost static, not to say frozen.

Loading pages 1 and 2 is mostly sufficient to get all new packages.
After that, loading the DETAILS for the few newcomers takes hardly any time worth mentioning.

Collecting results with SQL is an easy exercise, but pivoting a cube is a bit more comfortable.
So today I added classic IRIS Analytics to my package.
It is enabled in namespace USER, and the cube is named OEX, as is the first pivot to start with.

After the container starts, the Unit Test leaves a test set of page 1 with ~30 records,
which is also the initial content of the cube.

If you decide to run a completely fresh load, it is up to you to rebuild the cube in Analytics Architect.

With the QUICK variant, the final step is a rebuild of the cube, and you get this result.

So whether you use SQL or Analytics is your decision.

I count on your votes in the contest.
 

Article
· Jun 13, 2023 · 3 min read

LangChain on InterSystems PDF documentation

Yet another example of applying LangChain, to give some inspiration for the new community Grand Prix contest.

I was initially looking to build a chain to achieve dynamic search of the HTML of the documentation site, but in the end it was simpler to borg the static PDFs instead.

Create new virtual environment

mkdir chainpdf

cd chainpdf

python -m venv .

scripts\activate 

pip install openai
pip install langchain
pip install wget
pip install lancedb
pip install tiktoken
pip install pypdf

set OPENAI_API_KEY=[ Your OpenAI Key ]

python

Prepare the docs

import glob
import zipfile

import wget

# Download the documentation PDF bundle
url = 'https://docs.intersystems.com/irisforhealth20231/csp/docbook/pdfs.zip'
wget.download(url)

# Extract the docs into the current directory
with zipfile.ZipFile('pdfs.zip', 'r') as zip_ref:
    zip_ref.extractall('.')

# Get a list of files
pdfFiles = glob.glob("./pdfs/pdfs/*")

Load docs into Vector Store

import lancedb
from langchain import OpenAI
from langchain.chains import LLMChain
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts.prompt import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import LanceDB


embeddings = OpenAIEmbeddings()
db = lancedb.connect('lancedb')
table = db.create_table("my_table", data=[
    {"vector": embeddings.embed_query("Hello World"), "text": "Hello World", "id": "1"}
], mode="overwrite")

documentsAll=[]
pdfFiles=[file for file in glob.glob("./pdfs/pdfs/*")]
for file_name in pdfFiles:
  loader = PyPDFLoader(file_name)
  pages = loader.load_and_split()
  # Strip unwanted padding
  for page in pages:
    del page.lc_kwargs
    page.page_content=("".join((page.page_content.split('\xa0'))))
  documents = CharacterTextSplitter().split_documents(pages)
  # Ignore the cover pages
  for document in documents[2:]:
    documentsAll.append(document)

# This will take a couple of minutes to complete
docsearch = LanceDB.from_documents(documentsAll, embeddings, connection=table)
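For anyone curious about the "Strip unwanted padding" step in the loop above, here is a small standalone sketch of what the split/join expression does to a string. The helper name `strip_nbsp` and the sample text are made up for illustration; the point is that `\xa0` (a non-breaking space) is removed outright rather than replaced with a normal space.

```python
NBSP = "\xa0"  # non-breaking space, the padding character stripped in the loop


def strip_nbsp(text: str) -> str:
    """Remove every non-breaking space, using the same split/join idiom."""
    return "".join(text.split(NBSP))


sample = "Lock\xa0\xa0 Table\xa0"

# The split/join idiom gives the same result as str.replace
assert strip_nbsp(sample) == sample.replace(NBSP, "")

print(repr(strip_nbsp(sample)))
```

One thing to watch: if a PDF uses a non-breaking space as the only separator between two words, this removal would fuse them, so replacing with a regular space may be the safer choice for such documents.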

Prep the search template

_GetDocWords_TEMPLATE = """Answer the Question: {question}

By considering the following documents:
{docs}
"""

PROMPT = PromptTemplate(
     input_variables=["docs","question"], template=_GetDocWords_TEMPLATE
)

llm = OpenAI(temperature=0, verbose=True)

chain = LLMChain(llm=llm, prompt=PROMPT)
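To see roughly what text ends up being sent to the model, the template can be previewed with plain `str.format`, without calling LangChain or OpenAI at all. This is only an approximation under stated assumptions: the helper `render_prompt` is hypothetical, the passages are made-up strings, and joining them with blank lines is my choice (the chain in this article is handed document objects directly).

```python
TEMPLATE = """Answer the Question: {question}

By considering the following documents:
{docs}
"""


def render_prompt(question, docs):
    """Fill the template placeholders with a question and a list of passages."""
    return TEMPLATE.format(question=question, docs="\n\n".join(docs))


# Made-up passages standing in for vector store search results
preview = render_prompt(
    "What is a lock table?",
    ["First matching passage.", "Second matching passage."],
)
print(preview)
```

Previewing the prompt this way is a cheap check on how much text each question carries before any tokens are spent.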

Are you sitting down... Let's talk with the documentation

"What is a File adapter?"

# Ask the question
# First query the vector store for matching content
query = "What is a File adapter"
docs = docsearch.similarity_search(query)
# Only using the first two documents to reduce token search size on openai
chain.run(docs=docs[:2],question=query)

Answer:

'\nA file adapter is a type of software that enables the transfer of data between two different systems. It is typically used to move data from one system to another, such as from a database to a file system, or from a file system to a database. It can also be used to move data between different types of systems, such as from a web server to a database.'

"What is a lock table?"  

# Ask the question
# First query the vector store for matching content
query = "What is a lock table"
docs = docsearch.similarity_search(query)
# Only using the first two documents to reduce token search size on openai
chain.run(docs=docs[:2],question=query)

Answer:

'\nA lock table is a system-wide, in-memory table maintained by InterSystems IRIS that records all current locks and the processes that have owned them. It is accessible via the Management Portal, where you can view the locks and (in rare cases, if needed) remove them.'

 

I will leave building a user interface on top of this functionality as a future exercise.

Article
· Jun 12, 2023 · 3 min read

LangChain fixed the SQL for me

This article is a simple quick start (a record of what I did) with SQLDatabaseChain.

Hope this ignites some interest.

Many thanks to:

sqlalchemy-iris author @Dmitry Maslennikov

Your project made this possible today.

 

The script in this article uses the OpenAI API, so take care not to share table information and records externally that you did not intend to share.

A local model could be plugged in instead, if needed.

 

Creating a new virtual environment

mkdir chainsql

cd chainsql

python -m venv .

scripts\activate

pip install langchain

pip install wget

# Need to connect to IRIS so installing a fresh python driver
python -c "import wget;url='https://raw.githubusercontent.com/intersystems-community/iris-driver-distribution/main/DB-API/intersystems_irispython-3.2.0-py3-none-any.whl';wget.download(url)"

pip install intersystems_irispython-3.2.0-py3-none-any.whl

# And for more magic
pip install sqlalchemy-iris

pip install openai

set OPENAI_API_KEY=[ Your OpenAI Key ]

python

 

Initial Test

from langchain import OpenAI, SQLDatabase, SQLDatabaseChain

db = SQLDatabase.from_uri("iris://superuser:******@localhost:51775/USER")

llm = OpenAI(temperature=0, verbose=True)

db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)

db_chain.run("How many Tables are there")

Result error

sqlalchemy.exc.DatabaseError: (intersystems_iris.dbapi._DBAPI.DatabaseError) [SQLCODE: <-25>:<Input encountered after end of query>]
[Location: <Prepare>]
[%msg: < Input (;) encountered after end of query^SELECT COUNT ( * ) FROM information_schema . tables WHERE table_schema = :%qpar(1) ;>]
[SQL: SELECT COUNT(*) FROM information_schema.tables WHERE table_schema = 'public';]
(Background on this error at: https://sqlalche.me/e/20/4xp6)
SELECT COUNT(*) FROM information_schema.tables WHERE table_schema = 'public';

Inter-developer dialogue

IRIS didn't like being given SQL queries that end with a semicolon.

What to do now?

Idea: How about I tell LangChain to fix it for me?

Cool. Let's do this!!

 

Test Two

from langchain import OpenAI, SQLDatabase, SQLDatabaseChain

from langchain.prompts.prompt import PromptTemplate

_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.

Use the following format:

Question: "Question here"
SQLQuery: "SQL Query to run"
SQLResult: "Result of the SQLQuery"
Answer: "Final answer here"

The SQL query should NOT end with semi-colon
Question: {input}"""

PROMPT = PromptTemplate(
     input_variables=["input", "dialect"], template=_DEFAULT_TEMPLATE
)

db = SQLDatabase.from_uri("iris://superuser:******@localhost:51775/USER")

llm = OpenAI(temperature=0, verbose=True)

db_chain = SQLDatabaseChain(llm=llm, database=db, prompt=PROMPT, verbose=True) 

db_chain.run("How many Tables are there")

 

Result Two

SQLQuery: SELECT COUNT(*) FROM information_schema.tables
SQLResult: [(499,)]
Answer: There are 499 tables.
> Finished chain.
'There are 499 tables.'

I said it would be quick.
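The prompt instruction did the job here, but a model may not always follow it. As a deterministic complement, the generated SQL could also be cleaned before it reaches IRIS. Below is a minimal sketch of such a cleanup; the helper `strip_trailing_semicolon` is hypothetical, and how to hook it into the chain is deliberately left out because that depends on the LangChain version in use.

```python
def strip_trailing_semicolon(sql: str) -> str:
    """Remove trailing semicolons and whitespace from a SQL statement.

    Only characters at the very end are touched, so a semicolon inside
    a quoted string literal is left alone.
    """
    return sql.rstrip("; \t\r\n")


failing = "SELECT COUNT(*) FROM information_schema.tables WHERE table_schema = 'public';"
print(strip_trailing_semicolon(failing))
```

This only handles a single statement; text containing several statements separated by semicolons would still need proper parsing.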

 

References:

https://walkingtree.tech/natural-language-to-query-your-sql-database-usi...

https://python.langchain.com/en/latest/modules/chains/examples/sqlite.ht...

https://python.langchain.com/en/latest/modules/agents/plan_and_execute.html
