Friday, November 21, 2008

ChemSharp - should have googled it first

Then I would have known about this...






"The safest way to sharpen tungsten without grinding."

Thursday, November 13, 2008

CSInChI v0.5 Released

The first product of the ChemSharp project is now available to the public. The CSInChI library allows programmers to call the IUPAC InChI library from CLR languages. It is also compatible with IronPython, although Python programmers should read up on how IronPython handles value types and on using the clr.Reference class with methods that take out and ref parameters.

CSInChI is designed as a standalone library: ChemSharp uses it, but it does not depend on the rest of the project.

This is a beta release, so it's a little rough, and breaking changes may be made between now and the eventual 1.0 release. Using the default constructors of the structs and then initializing the fields will be the best way to ensure compatibility with future releases.
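As a rough sketch of that advice, here is the pattern applied to a hypothetical struct. ExampleInput and its fields are made up for illustration and are not CSInChI types. The idea, presumably, is that code written this way keeps compiling if a later release adds fields, since the default constructor zero-initializes whatever the caller does not set.

```csharp
using System;

// Hypothetical struct for illustration only; not a CSInChI type.
public struct ExampleInput
{
    public string InchiString;
    public string Options;
}

public static class DefaultConstructorExample
{
    public static ExampleInput Build()
    {
        // The default constructor zero-initializes every field,
        // so any field not assigned below keeps its default value.
        ExampleInput input = new ExampleInput();
        input.InchiString = "InChI=1/H3N/h1H3";
        input.Options = "";
        return input;
    }
}
```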

The next few posts will contain examples of how to use this library. More examples are included in the documentation.

CSInChI can be downloaded from this link.

Questions and comments can be directed either to me or to the CSInChI mailing list at chemsharp-csinchi@lists.sourceforge.net

Wednesday, November 12, 2008

Languages available for use with the CLR

A recent post to the OpenBabel mailing list reminded me that many scientists who write .NET code are not fully aware of the wide array of compatible languages. In fact, most popular programming languages and many not-so-popular ones have been ported to the CLR. The list includes:

A# (Ada)
NetCobol
IronRuby
S# (Smalltalk)
FTN95 (Fortran)
F# (OCaml)

and many more...

A fairly complete list is posted here.

Sunday, November 9, 2008

Science Code .Net and Numerical Recipes

Here is an interesting project:

http://www.sciencecode.com/

It seems to be an effort to implement the classic Numerical Recipes and provide classes to do some common Physics/Math calculations from C#. When I get a chance I'll be trying it out and posting a review. If anyone has experience with it, leave a comment and let me know what you thought of it.

Friday, November 7, 2008

Interop Example: Marshalling Structures To The InChI Library

For the last week I've been finishing up CSInChI, a library for using the IUPAC InChI library from C#. For those not acquainted with it, the InChI (International Chemical Identifier) is a line notation used to represent molecular structures. Line notations are simply ways of encoding a structure as a text string. Since the official InChI API provided by the IUPAC is written in C, I thought this would be a good time to post an interop example. This tutorial will illustrate how to call an unmanaged function that takes structures as parameters using Platform Invoke.

The InChI library can be downloaded from: http://www.iupac.org/inchi/

In this example we'll tackle the function:

int GetStructFromINCHI(inchi_InputINCHI *inpInChI, inchi_OutputStruct *outStruct)

This function takes two C structs as parameters and returns an integer error code. The first one holds two strings: the InChI and a string of options.

typedef struct tagINCHI_InputINCHI {
/* the caller is responsible for the data allocation and deallocation */
char *szInChI; /* InChI ASCIIZ string to be converted to a structure */
char *szOptions; /* InChI options: space-delimited; each is preceded by */
/* '/' or '-' depending on OS and compiler */
} inchi_InputINCHI;



We'll begin by creating a matching C# structure:

public struct InChI_String_Input
{
public string inchiString;
public string options;
}


In this case we get surprisingly lucky and this struct marshals just fine with no additional attributes. The key thing here is to make sure that the fields are listed in the same order as in the unmanaged structure and that the type of each field is the same size as the C type. By default the C# compiler lays out the fields of a struct sequentially. If you want to use a class you must apply the [StructLayout(LayoutKind.Sequential)] attribute.
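For the class case, a minimal sketch might look like the following. The class name InChI_String_Input_Class and the LayoutCheck helper are hypothetical and only there to show the attribute in place. One thing to keep in mind is that a class with a sequential layout is already passed to unmanaged code as a pointer, so the P/Invoke parameter would be declared without ref.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class-based counterpart of InChI_String_Input.
// Without this attribute the CLR is free to reorder the fields of a class.
[StructLayout(LayoutKind.Sequential)]
public class InChI_String_Input_Class
{
    public string inchiString;
    public string options;
}

public static class LayoutCheck
{
    // Returns the unmanaged byte offset of the second field. With a
    // sequential layout it sits right after the first char* pointer.
    public static int OptionsOffset()
    {
        return Marshal.OffsetOf(typeof(InChI_String_Input_Class), "options").ToInt32();
    }
}
```

Marshal.OffsetOf is a convenient way to check that the managed declaration lines up with the C header before calling into the dll.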

The C struct that holds the output from the function looks like this:

typedef struct tagINCHI_OutputStruct {
inchi_Atom *atom;
inchi_Stereo0D *stereo0;
S_SHORT num_atoms;
S_SHORT num_stereo0;
char *szMessage;
char *szLog;
unsigned long WarningFlags[2][2];
} inchi_OutputStruct;

A C# equivalent looks something like this:

using System;
using System.Runtime.InteropServices;

public struct InChI_Struct_Output
{
public IntPtr AtomsPtr;
public IntPtr StereoPtr;

public short NumAtoms;
public short NumStereo0D;

public string Message;
public string Log;

// C unsigned long is 32 bits on Windows, so uint (not the 64-bit ulong) matches its size
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
public uint[] WarningFlags;
}


The details of the inchi_Atom and inchi_Stereo0D structures will be discussed in a future post. For now we're only going to worry about how to marshal arrays. Because a C style array is represented by a pointer to the first item in the array, the C# equivalent is the IntPtr struct from the System namespace. The WarningFlags array has to be changed to a 1-D array with the same total capacity because the CLR does not support marshaling nested arrays. Note that the MarshalAs attribute specifies a size. This is required both because the array has a fixed size in C and because the CLR needs to know the runtime size of an array in order to marshal it.
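Once the 2x2 array is flattened, a small helper can translate the C style [row][col] indices into an index in the 1-D array. This is only a sketch: the WarningFlagIndex class is hypothetical, and it relies on the fact that C stores multidimensional arrays in row-major order.

```csharp
using System;

public static class WarningFlagIndex
{
    private const int Rows = 2;    // first dimension of WarningFlags[2][2]
    private const int Columns = 2; // second dimension of WarningFlags[2][2]

    // C stores WarningFlags[row][col] in row-major order, so the flat
    // index into the marshaled 1-D array is row * Columns + col.
    public static int Flatten(int row, int col)
    {
        if (row < 0 || row >= Rows || col < 0 || col >= Columns)
            throw new ArgumentOutOfRangeException();
        return row * Columns + col;
    }
}
```

With this helper, the value C code would read as WarningFlags[1][0] is found at WarningFlags[WarningFlagIndex.Flatten(1, 0)] on the managed side.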

To convert the pointer representing a C style array to an array of C# structures, write a method that takes the pointer and increments it by the size of the structure it represents, calling the Marshal.PtrToStructure method at each iteration to convert the pointer to a C# structure.

public InChI_Atom[] GetAtoms()
{
int atomSize = Marshal.SizeOf(typeof(InChI_Atom));
InChI_Atom[] iAtoms = new InChI_Atom[NumAtoms];

InChI_Atom a;
IntPtr pAtom = AtomsPtr;

for (int i = 0; i < iAtoms.Length; i++)
{
a = (InChI_Atom)Marshal.PtrToStructure(pAtom, typeof(InChI_Atom));
iAtoms[i] = a;
// Use ToInt64 so the pointer arithmetic also works in a 64-bit process
pAtom = new IntPtr(pAtom.ToInt64() + atomSize);
}
return iAtoms;
}
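
The same pointer walk can be tried out without the InChI dll. The sketch below uses a hypothetical DemoPoint struct as a stand-in for InChI_Atom and simulates the unmanaged array with Marshal.AllocHGlobal. None of these names come from CSInChI.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for InChI_Atom; the real layout is not shown here.
[StructLayout(LayoutKind.Sequential)]
public struct DemoPoint
{
    public double X;
    public double Y;
}

public static class PointerWalkDemo
{
    public static DemoPoint[] RoundTrip(DemoPoint[] source)
    {
        int size = Marshal.SizeOf(typeof(DemoPoint));

        // Simulate the unmanaged array that a C function would hand back.
        IntPtr buffer = Marshal.AllocHGlobal(size * source.Length);
        try
        {
            for (int i = 0; i < source.Length; i++)
            {
                IntPtr slot = new IntPtr(buffer.ToInt64() + (long)i * size);
                Marshal.StructureToPtr(source[i], slot, false);
            }

            // Walk the buffer one element at a time, as GetAtoms does.
            DemoPoint[] result = new DemoPoint[source.Length];
            for (int i = 0; i < result.Length; i++)
            {
                IntPtr slot = new IntPtr(buffer.ToInt64() + (long)i * size);
                result[i] = (DemoPoint)Marshal.PtrToStructure(slot, typeof(DemoPoint));
            }
            return result;
        }
        finally
        {
            // The demo owns this buffer, so it frees it here.
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

In the real case the buffer belongs to the InChI library, so it would not be released with FreeHGlobal as it is in this demo.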


Finally we create a class to hold the methods that access the unmanaged dll.

public static class LibInChI
{
[DllImport("libinchi.dll", EntryPoint = "GetStructFromINCHI")]
public static extern int ParseInChI(ref InChI_String_Input input, out InChI_Struct_Output output);
...
...
}


To call the method:

//All fields need to be set to non-null values
InChI_String_Input inp;
inp.inchiString = "InChI=1/H3N/h1H3";
inp.options = "";

InChI_Struct_Output outStruct;

int retVal = LibInChI.ParseInChI(ref inp, out outStruct);

Note the use of the ref and out keywords. When ref and out parameters are marshaled they are passed by address, the equivalent of &theParam in C. Remember that if a method has any ref or out parameters the keywords must be explicitly specified each time the method is called.

That's it for today. The next interop example will look at the InChI_Atom struct and how to ensure that unmanaged resources are freed. For those who are interested, CSInChI will be available within the next week (fingers crossed!) from the ChemSharp project. http://sourceforge.net/projects/chemsharp