Summary
· May 14, 2025

InterSystems Community Q&A Monthly Newsletter #48

Top new questions
Can you answer these questions?
#InterSystems IRIS
Error in iris-rag-demo
By Oliver Wilms
My AI use case - need help with Ollama and / or Langchain
By Oliver Wilms
Field Validation on INSERT error installing git-source-control
By Steve Pisani
Code Tables vs Cache SQL Tables in HealthShare Provider Directory
By Scott Roth
Problem deploying to a namespace
By Anthony Decorte
Error Handling Server to Client Side - Best Practices
By Michael Davidovich
Debugging %Net.HttpRequest
By Michael Davidovich
Problem exporting %Library.DynamicObject to JSON with %JSON.Adaptor
By Marcio Coelho
Setting Python
By Touggourt
SQLCODE 25 - Input Encountered after end of query
By Touggourt
How to access a stream property in a SQL trigger
By Pravin Barton
ERROR #5883: Item '%Test' is mapped from a database that you do not have write permission on
By Jonathan Perry
IntegratedML
By Touggourt
Embedded Python query
By Touggourt
Append a string in a update query
By Jude Mukkadayil
How can I call $System.OBJ.Load() from a linux shell script? (Or $System.OBJ.Import, instead)
By AC Freitas
IRIS xDBC protocol is not compatible error while python execution
By Ashok Kumar T
Best Practice for Existing ODBC Connection When System Becomes IRIS for HealthShare
By Fraser J. Hunter
web applications definitions syncing between primary and backup nodes
By Feng Wang
SQLCODE: -99 when executing dynamic SQL on specific properties
By Martin Nielsen
How to prevent reentrancy inside same process ?
By Norman W. Freeman
#InterSystems IRIS for Health
#Caché
#TrakCare
#HealthShare
#Health Connect
#48 Monthly Q&A from InterSystems Developers
Article
· May 14, 2025 · 7 min read

A Semantic Code Search Solution for TrakCare Using IRIS Vector Search

This article presents a potential solution for semantic code search in TrakCare using IRIS Vector Search.

Here's a brief overview of results from the TrakCare Semantic code search for the query: "Validation before database object save".

 

  • Code Embedding model 

There are numerous embedding models designed for sentences and paragraphs, but they are not ideal for code-specific embeddings.

Three code-specific embedding models were evaluated: voyage-code-2, CodeBERT, and GraphCodeBERT. While none of these models were pre-trained on the ObjectScript language, they still outperformed general-purpose embedding models in this context.

CodeBERT was chosen as the embedding model for this solution, offering reliable performance without the need for an API key. 😁

# Imports needed to make this snippet self-contained
import torch
from transformers import RobertaTokenizer, RobertaModel

class GraphCodeBERTEmbeddingModel:
    def __init__(self, model_name="microsoft/codebert-base"):
        self.tokenizer = RobertaTokenizer.from_pretrained(model_name)
        self.model = RobertaModel.from_pretrained(model_name)
    
    def get_embedding(self, text):
        """
        Generate a CodeBERT embedding for the given text.
        """
        inputs = self.tokenizer(text, return_tensors="pt", max_length=512, truncation=True, padding="max_length")
        with torch.no_grad():
            outputs = self.model(**inputs)
        # Use the [CLS] token embedding for the representation
        cls_embedding = outputs.last_hidden_state[:, 0, :].squeeze().numpy()
        return cls_embedding

 

  • IRIS Vector database

A table is defined with a VECTOR-typed column to store the embeddings. Please note that a COLUMNAR index is not supported on a VECTOR-typed column.

CodeBERT embeddings have 768 dimensions, and the model can process inputs of at most 512 tokens.

CREATE TABLE TrakCareCodeVector (
                file VARCHAR(150),
                codes VARCHAR(2000),
                codes_vector VECTOR(DOUBLE,768)
            ) 

  

  • Python DB-API  

The Python DB-API is used to establish a connection with the IRIS instance and execute the SQL statements.

  1. Build the vector database for the TrakCare source code.
  2. Retrieve the top_K most similar code embeddings from the IRIS vector database (ranked by VECTOR_COSINE in the query below).
# build IRIS vector database
import iris
import os
from dotenv import load_dotenv

load_dotenv()

class IrisConn:
    """Connection with IRIS instance to execute the SQL statements """
    def __init__(self) -> None:
        connection_string = os.getenv("CONNECTION_STRING")
        username = os.getenv("IRISUSERNAME")
        password = os.getenv("PASSWORD")

        self.connection = iris.connect(
            connectionstr=connection_string,
            username=username,
            password=password,
            timeout=10000,
        )
        self.cursor = self.connection.cursor()

    def insert(self, params: list):
        try:
            sql = "INSERT INTO TrakCareCodeVector (file, codes, codes_vector) VALUES (?, ?, TO_VECTOR(?,double))"
            self.cursor.execute(sql,params)
        except Exception as ex:
            print(ex)
    
    def fetch_query(self, query: str):
        self.cursor.execute(query)
        return self.cursor.fetchall()

    def close_db(self):
        self.cursor.close()
        self.connection.close()
        
from transformers import AutoTokenizer, AutoModel, RobertaTokenizer, RobertaModel, logging
import torch
import numpy as np
import os
from db import IrisConn
from GraphcodebertEmbeddings import MethodEmbeddingGenerator
from IRISClassParser import parse_directory
import sys, getopt

class GraphCodeBERTEmbeddingModel:
    def __init__(self, model_name="microsoft/codebert-base"):
        self.tokenizer = RobertaTokenizer.from_pretrained(model_name)
        self.model = RobertaModel.from_pretrained(model_name)
    
    def get_embedding(self, text):
        """
        Generate a CodeBERT embedding for the given text.
        """
        inputs = self.tokenizer(text, return_tensors="pt", max_length=512, truncation=True, padding="max_length")
        with torch.no_grad():
            outputs = self.model(**inputs)
        # Use the [CLS] token embedding for the representation
        cls_embedding = outputs.last_hidden_state[:, 0, :].squeeze().numpy()
        return cls_embedding
        
class IrisVectorDB:
    def __init__(self, vector_dim):
        """
        Initialize the IRIS vector database.
        """
        self.conn = IrisConn()
        self.vector_dim = vector_dim

    def insert(self, description: str, codes: str, vector):
        params=[description, codes, f'{vector.tolist()}']
        self.conn.insert(params)

    def search(self, query_vector, top_k=5):
        query_vectorStr = query_vector.tolist()
        query = f"SELECT TOP {top_k} file,codes FROM TrakCareCodeVector ORDER BY VECTOR_COSINE(codes_vector, TO_VECTOR('{query_vectorStr}',double)) DESC"
        results = self.conn.fetch_query(query)
        return results
    
# Chatbot for code retrieval
class CodeRetrieveChatbot:
    def __init__(self, embedding_model, vector_db):
        self.embedding_model = embedding_model
        self.vector_db = vector_db
    
    def add_to_database(self, description, code_snippet, embedding = None):
        if embedding is None:
            embedding = self.embedding_model.get_embedding(code_snippet)
        self.vector_db.insert(description, code_snippet, embedding)
    
    def retrieve_code(self, query, top_k=5):
        """
        Retrieve the most relevant code snippets for the given query.
        """
        query_embedding = self.embedding_model.get_embedding(query)
        results = self.vector_db.search(query_embedding, top_k)
        return results
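For reference, a minimal sketch of how these classes could be exercised for a query, using the example query from the screenshot above; the wiring follows the classes defined in this script, and the driver itself is illustrative rather than part of the original application.

# Minimal usage sketch for the retrieval path, reusing the classes above.
if __name__ == "__main__":
    embedding_model = GraphCodeBERTEmbeddingModel()
    vector_db = IrisVectorDB(vector_dim=768)   # CodeBERT embeddings are 768-dimensional
    chatbot = CodeRetrieveChatbot(embedding_model, vector_db)

    # Each result row is (file, codes) from the TrakCareCodeVector table
    for file, codes in chatbot.retrieve_code("Validation before database object save", top_k=5):
        print(file)
        print(codes)
        print("-" * 60)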
  • Code Chunks  

Since CodeBERT can process inputs of at most 512 tokens, large classes and methods have to be chunked into smaller parts. Each chunk is then embedded and stored in the vector database.

from transformers import AutoTokenizer, AutoModel, RobertaTokenizer, RobertaModel
import torch
from IRISClassParser import parse_directory

class MethodEmbeddingGenerator:
    def __init__(self, model_name="microsoft/codebert-base"):
        """
        Initialize the embedding generator with CodeBERT.

        :param model_name: The name of the pretrained CodeBERT model.
        """
        self.tokenizer = RobertaTokenizer.from_pretrained(model_name)
        self.model = RobertaModel.from_pretrained(model_name)
        self.max_tokens = self.tokenizer.model_max_length  # Typically 512 for CodeBERT
    def chunk_method(self, method_implementation):
        """
        Split method implementation into chunks based on lines of code that approximate the token limit.

        :param method_implementation: The method implementation as a string.
        :return: A list of chunks.
        """
        lines = method_implementation.splitlines()
        chunks = []
        current_chunk = []
        current_length = 0
        for line in lines:
            # Estimate tokens of the line
            line_token_estimate = len(self.tokenizer.tokenize(line))
            if current_length + line_token_estimate <= self.max_tokens - 2:
                current_chunk.append(line)
                current_length += line_token_estimate
            else:
                # Add the current chunk to chunks and reset
                chunks.append("\n".join(current_chunk))
                current_chunk = [line]
                current_length = line_token_estimate

        # Add the last chunk if it has content
        if current_chunk:
            chunks.append("\n".join(current_chunk))

        return chunks

    def get_embeddings(self, method_implementation):
        """
        Generate embeddings for a method implementation, handling large methods by chunking.

        :param method_implementation: The method implementation as a string.
        :return: A list of embeddings (one for each chunk).
        """
        chunks = self.chunk_method(method_implementation)
        embeddings = {}

        for chunk in chunks:
            inputs = self.tokenizer(chunk, return_tensors="pt", truncation=True, padding=True, max_length=self.max_tokens)
            with torch.no_grad():
                outputs = self.model(**inputs)
                # Use the [CLS] token embedding (index 0) as the representation
                cls_embedding = outputs.last_hidden_state[:, 0, :].squeeze(0)
                embeddings[chunk] = cls_embedding.numpy()

        return embeddings

    def process_methods(self, methods):
        """
        Process a list of methods to generate embeddings for each.

        :param methods: A list of dictionaries with method names and implementations.
        :return: A dictionary with method names as keys and embeddings as values.
        """
        method_embeddings = {}
        for method in methods:
            method_name = method["name"]
            implementation = method["implementation"]
            print(f"Processing method embedding: {method_name}")
            method_embeddings[method_name] = self.get_embeddings(implementation)
        return method_embeddings
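To show how the chunker feeds the vector table, here is an illustrative indexing sketch. The directory path is hypothetical, and the exact shape returned by parse_directory (class and method dictionaries) is an assumption for this example; only IrisConn, MethodEmbeddingGenerator, and parse_directory come from the code above.

# Illustrative indexing sketch: parse the source tree, chunk and embed each
# method, and store one row per chunk in TrakCareCodeVector.
from db import IrisConn
from GraphcodebertEmbeddings import MethodEmbeddingGenerator
from IRISClassParser import parse_directory

generator = MethodEmbeddingGenerator()
conn = IrisConn()

for parsed_class in parse_directory("./trakcare-src"):    # hypothetical source path
    for method in parsed_class["methods"]:                # assumed parser output shape
        chunk_embeddings = generator.get_embeddings(method["implementation"])
        for chunk_text, embedding in chunk_embeddings.items():
            file_label = f"{parsed_class['name']}.{method['name']}"
            conn.insert([file_label, chunk_text, f"{embedding.tolist()}"])

conn.close_db()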
  • UI - The Angular APP  

The stack uses Angular as the frontend and Python (Flask) as the backend.
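The backend code is not shown in the article, but a minimal Flask endpoint wrapping the retrieval classes could look like the sketch below. The /api/search route, request payload shape, and port are illustrative assumptions, not the actual application.

# Minimal Flask backend sketch for the Angular frontend (illustrative only),
# assuming GraphCodeBERTEmbeddingModel, IrisVectorDB and CodeRetrieveChatbot
# are importable from the script above.
from flask import Flask, jsonify, request

app = Flask(__name__)

embedding_model = GraphCodeBERTEmbeddingModel()
vector_db = IrisVectorDB(vector_dim=768)
chatbot = CodeRetrieveChatbot(embedding_model, vector_db)

@app.route("/api/search", methods=["POST"])
def search():
    query = request.get_json().get("query", "")
    results = chatbot.retrieve_code(query, top_k=5)
    # Each row is (file, codes) from the TrakCareCodeVector table
    return jsonify([{"file": f, "codes": c} for f, c in results])

if __name__ == "__main__":
    app.run(port=5000)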

 

  • Future Directions

The search results are not perfect because the embedding model is not pre-trained on ObjectScript.

Question
· May 14, 2025

Why Legal Research Matters

Legal writing isn't just about opinion—it's about persuasion through precedent, legislation, and scholarly analysis. A strong essay requires:

  • Citing relevant statutes and case law
  • Understanding jurisdictional differences
  • Using up-to-date sources
  • Applying legal principles accurately

Poor research leads to weak arguments, lower marks, and missed learning opportunities.

Common Challenges Law Students Face

Even bright students can struggle with:

  • Finding authoritative sources
  • Understanding legal databases
  • Distinguishing between primary and secondary sources
  • Structuring research into coherent arguments

These challenges are amplified by tight deadlines and competing coursework.

How Law Essay Pros Can Help

We don’t just write — we research with precision. Our writers are trained in legal research methods and have access to trusted legal databases. Here's what we bring to the table:

  • Thorough Legal Analysis: Every essay is supported by relevant, current, and credible legal authorities.
  • Jurisdiction-Specific Accuracy: Whether you're studying UK, US, Canadian, or Australian law, we tailor the research accordingly.
  • Academic Standards Compliance: We follow university-level citation styles (OSCOLA, Bluebook, AGLC, etc.) and formatting guidelines.

Boost Your Grades With Stronger Foundations

Better research means better essays. Many students turn to Law Essay Pros not just for writing support, but to understand how proper legal analysis strengthens their work. With our guidance, you’ll learn:

  • How to frame arguments using real case law
  • How to interpret and apply legislation
  • How to write more persuasively, using research as your backbone

Conclusion

Legal research is a skill that defines success in both academia and practice. At Law Essay Pros, we help law students master this skill by providing high-quality, research-rich academic support. Whether you're writing an essay, case brief, or dissertation, we ensure your arguments are well-supported and academically sound.

Article
· May 14, 2025 · 7 min read

OMOP Odyssey - GCP Healthcare API Real Time FHIR® to OMOP Transformation ( RealTymus )

Real Time FHIR® to OMOP Transformation

In this part of the OMOP Journey, before attempting to challenge Scylla, we reflect on how fortunate we are that the InterSystems OMOP transform is built on Bulk FHIR Export as the source payload. This opens up hands-off interoperability with the InterSystems OMOP transform across several FHIR® vendors, this time with the Google Cloud Healthcare API.

Google Cloud Healthcare API FHIR® Export

GCP FHIR® data stores support bulk FHIR import/export from the CLI or the API. The premise is simple and the docs are exhaustive, so we'll save a model the trouble of training on them again and just link them for anyone interested. The more valuable thing to understand from this section's heading is Google's implementation of the Bulk FHIR Export standard itself.

The important differentiators in Google's implementation of FHIR® Export are resource change notifications via Pub/Sub and the ability to specify incremental exports.

Real Time? ⏲

Yes! I'll die on this sword, I guess. It's not only my rap handle, but the mechanics are definitely there to back a good technical argument and to be able to say...

"As a new Organization gets created in FHIR, we transform it and add it to the InterSystems OMOP CDM in the same stroke as a care_site/location."

Walkthrough

I'll try to keep this short and to the point; it encapsulates how a Pub/Sub notification coupled with a Cloud Function can glue these two solutions together and automate your OMOP ingestion at a granular level.

Step One: Wire Up InterSystems OMOP to AWS Bucket

This step is becoming repetitive in posts in this community, so I will go warp speed through the steps.

  • Procure AWS S3 Bucket
  • Launch InterSystems OMOP, Add Bucket Configuration
  • Eject Policy from InterSystems OMOP Deployment
  • Apply Policy to the AWS S3 Bucket

 

I dunno, the steps and image seemed to work out better in my head, but maybe not. Here are the docs, and here is a more in-depth way to get this taken care of earlier in this series, with better examples.

Step Two: Add Pub/Sub Target in Google Cloud Healthcare API

As mentioned previously, a foundational piece of making this work is the great feature that notifies on resource changes in the data store. You will find this option in the setup dialog, and it is also available post-configuration. I typically like to check both options to have as much data in the notification as possible to play with. For instance, with deletes you can include the deleted resource in the notification as well, which is really great for EMPI solutions.
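For reference, the same notification configuration can be applied programmatically. The sketch below uses the Healthcare API discovery client; the notificationConfigs field names (sendFullResource, sendPreviousResourceOnDelete) and the environment variables are assumptions to verify against the current API reference.

# Illustrative sketch: attach a Pub/Sub notification config to a FHIR store.
# Field names follow the Healthcare API FhirStore.notificationConfigs resource;
# verify them against the current API reference before use.
import os
from googleapiclient import discovery

fhir_store = os.environ["FHIR_STORE"]   # projects/.../datasets/.../fhirStores/...
topic = os.environ["PUBSUB_TOPIC"]      # projects/.../topics/...

healthcare = discovery.build("healthcare", "v1")
healthcare.projects().locations().datasets().fhirStores().patch(
    name=fhir_store,
    updateMask="notificationConfigs",
    body={
        "notificationConfigs": [{
            "pubsubTopic": topic,
            "sendFullResource": True,              # include the resource body in the message
            "sendPreviousResourceOnDelete": True,  # include the deleted resource too
        }]
    },
).execute()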

 

Step Three: Cloud Function ⭐

The cloud function puts in the work, and the SOW for that looks a little bit like this.

Listen for FHIR resource change Pub/Sub notifications of type Organization for the create method, and export the data store incrementally from the time the event fired. Since the export function only supports a GCS target, read in the created export, create a FHIR export zip file that places the ndjson files in the root of the archive, and push the created zip file to the AWS bucket.

Restating the second feature that makes this especially great: the ability to export from a specific date and time means we do not need to export the entire dataset. For this we will use the time we received the event and tack a minute or so onto it, in the hope that the export, import, and transform steps will be smaller and, of course, more timely.

 
realtimefhir2omop.py
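The attached realtimefhir2omop.py is not reproduced here, but the shape of such a function is roughly the sketch below. It assumes a functions-framework Pub/Sub trigger, google-cloud-storage and boto3 clients, and environment variables for the FHIR store, GCS staging bucket, and AWS target bucket; the incremental-export request field (_since) and the notification payload shape should be checked against the Healthcare API docs.

# Illustrative sketch of such a function (NOT the attached realtimefhir2omop.py).
# Assumptions: env vars FHIR_STORE, GCS_BUCKET and AWS_BUCKET; verify the
# `_since` export field and the notification payload shape against the docs.
import base64
import io
import os
import zipfile
from datetime import datetime, timedelta, timezone

import boto3
import functions_framework
from google.cloud import storage
from googleapiclient import discovery


@functions_framework.cloud_event
def realtime_fhir_to_omop(cloud_event):
    payload = base64.b64decode(cloud_event.data["message"]["data"]).decode("utf-8")
    # The payload is the resource name (or the full resource, depending on the
    # notification config); only react to Organization changes.
    if "Organization" not in payload:
        return

    # Incremental export: only resources changed since shortly before the event fired.
    since = (datetime.now(timezone.utc) - timedelta(minutes=1)).isoformat()
    healthcare = discovery.build("healthcare", "v1")
    export_prefix = "export"
    operation = healthcare.projects().locations().datasets().fhirStores().export(
        name=os.environ["FHIR_STORE"],
        body={
            "gcsDestination": {"uriPrefix": f"gs://{os.environ['GCS_BUCKET']}/{export_prefix}"},
            "_since": since,
        },
    ).execute()
    # In a real function you would poll this long-running operation until it
    # completes before reading the exported files.

    # Zip the exported ndjson files into the root of the archive and push to AWS.
    gcs = storage.Client()
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for blob in gcs.list_blobs(os.environ["GCS_BUCKET"], prefix=export_prefix):
            archive.writestr(os.path.basename(blob.name), blob.download_as_bytes())
    buffer.seek(0)
    boto3.client("s3").upload_fileobj(buffer, os.environ["AWS_BUCKET"], "fhir-export.zip")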

Step Four: What is Happening right now? 🔥

To break down what is going on, let's inspect the real-time processing with screenshots at each point.

FHIR Organization Created

Pub/Sub Event is Published

 
Pub/Sub FHIR Event

Cloud Function Receives Resource Event from Subscription

Cloud Function Exports the FHIR Store GCS

Cloud Function Creates ZIP from GCS and Pushes to AWS

InterSystems OMOP Transforms FHIR to OMOP

Organization Available as Care Site in CDM

When did that FHIR resource get transformed to the CDM?

(Spaceballs GIF: "Now. You're looking at now. Everything that happens now is happening now.")

Step Five: Validation Fun ✔

Fun with OBS and Not so Much fun with Audio


 

In Conclusion
 

I did something similar last year at the MIT Grand Hack, using the same design pattern, but with the Questionnaire/Response resources and Gemini in the middle of things.

Gemini FHIR Agent MIT Grand Hack

Announcement
· May 14, 2025

The InterSystems IRIS Development Professional exam is now available!

Hello everyone,

The InterSystems Learning Services Certification team is pleased to announce the release of our new InterSystems IRIS Development Professional exam. It is now available for purchase and scheduling in the InterSystems exam catalog. Potential candidates can review the exam topics and practice questions to familiarize themselves with the exam's approach and content. Candidates who pass the exam will receive a digital certification badge to share on social networks such as LinkedIn. If you are new to InterSystems Certification, please review our program pages, which include information about exams, exam policies, the FAQ, and much more.

If you have ideas for new certifications that could help you advance your career, the InterSystems Learning Services Certification team is always open to your suggestions. Please feel free to contact us at certification@intersystems.com to share your ideas.

Looking forward to celebrating your success,
@Celeste Canzano - Certification Operations Specialist, InterSystems
