# Using *Cosymlib* to analyze the symmetry properties of molecules

### Pere Alemany <br> *Institut de Química Teòrica i Computacional de la Universitat de Barcelona (IQTCUB)*

June 2023
                                                           
This series of tutorials is meant to introduce you to *Cosymlib*, a Python library for calculating continuous shape and symmetry measures (CShMs and CSMs). In each notebook we will show different ways of using the functions in the library.

*Cosymlib* can be used either in a simple command-line mode or via Python scripting. The main focus of these tutorials is to learn how to perform the basic tasks in *Cosymlib* from the command line. At the end we will give some examples of how to use it in an advanced mode by including calls to *Cosymlib*'s functions in your own Python scripts. All important tasks can, however, be done in command-line mode, and this is the recommended option for users who just need to compute a CShM or CSM occasionally. In this case, all you need is a file with your structure and a call to one of the basic *Cosymlib* commands; no knowledge of Python programming is required.

To start using *Cosymlib* you just need to install it as you would any other Python library. This step is not necessary if *Cosymlib* is already installed on your computer. If you run the notebook in Google Colab, you can install *Cosymlib* using the command in the cell below:

In [None]:
!pip install cosymlib

*Cosymlib* lets you compute both continuous shape measures (CShMs) and continuous symmetry measures (CSMs) for molecules described as a set of points in Euclidean 3D space. You can also analyze the symmetry of the electronic structure of a given molecule by computing CSMs for its electron density, wavefunction or individual molecular orbitals. All these tasks can be done by means of simple calls to the following commands:

*   **gsym**: Calculation of symmetry measures for a set of geometries
*   **cchir**: Calculation of chirality measures for a set of geometries
*   **shape**: Calculation of shape measures for a set of geometries
*   **shape_map**: Use two CShMs to compute a shape map for a set of geometries
*   **esym**: Calculation of symmetry measures for the electron density of a molecule
*   **mosym**: Pseudo-symmetry analysis for the molecular orbitals of a molecule

Besides this, there is also a general script that allows all types of calculations above:
*   **cosym**: Allows for all of the above measures


## How to execute a command ##

To run any of these commands you need to supply *Cosymlib* with at least the geometry of a molecule, given as the Cartesian coordinates and the atomic symbols (or arbitrary labels) of each atom. When you use *Cosymlib* in command-line mode you just call the desired command, for instance `gsym`, and on the same line indicate the file containing the geometry or the electronic structure of your molecule or molecules. In some cases you may also need to specify additional options that modify the behavior of the command.

A general command line will, thus, look like:

`gsym filename.xyz -m C2`

where `gsym` is the name of the command, `filename.xyz` is the name of the file containing the structure we want to analyze, and `-m C2` is an optional argument indicating the task we want to perform.
Throughout these tutorials we will see examples of the main options of the scripts included in *Cosymlib*.
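
To make the three parts of such a command line explicit, the short Python sketch below splits the example string with the standard-library `shlex` module. It is only an illustration and does not call *Cosymlib*; `filename.xyz` is the same placeholder name used above.

```python
import shlex

# The general command line shown above
command_line = "gsym filename.xyz -m C2"

# Split it the way a shell would: command, structure file, remaining options
command, structure_file, *options = shlex.split(command_line)

print("command:", command)         # the Cosymlib script to call
print("file:   ", structure_file)  # the file with the structure(s)
print("options:", options)         # optional arguments that select the task
```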



## Supported Files

*Cosymlib* is able to generate a Molecule object (a data structure containing all the necessary information on a molecule) by reading any of the following file types, which contain the geometry or the electronic structure of one or more molecules.

Geometry files:

* **.xyz**: With either Atomic Numbers or Atomic Symbols
* **.pdb**: Includes connectivity 
* **.cor**: ConQuest-formatted files

Geometry files can contain a single structure or multiple structures. The only limitation is that **all** the structures in a file must contain the **same number of atoms**.
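
If you want to check this condition yourself before running a command, a minimal sketch along the following lines may help. It assumes the usual `.xyz` layout (a line with the number of atoms, a title line, then one line per atom) and is not the reader that *Cosymlib* uses internally; the function names and the coordinates in the example are made up for illustration.

```python
def read_xyz_structures(text):
    """Split the text of an .xyz file into a list of (title, atoms) pairs.

    atoms is a list of (label, x, y, z) tuples.
    """
    lines = text.splitlines()
    structures = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():   # skip blank lines between structures
            i += 1
            continue
        n_atoms = int(lines[i])    # first line: number of atoms
        title = lines[i + 1].strip()  # second line: free-text title
        atoms = []
        for line in lines[i + 2 : i + 2 + n_atoms]:
            label, x, y, z = line.split()[:4]
            atoms.append((label, float(x), float(y), float(z)))
        structures.append((title, atoms))
        i += 2 + n_atoms
    return structures


def same_number_of_atoms(structures):
    """True if every structure in the list has the same number of atoms."""
    return len({len(atoms) for _, atoms in structures}) <= 1


# Two structures in one file (illustrative coordinates only)
example = """\
3
water (made-up coordinates)
O  0.000  0.000  0.000
H  0.757  0.586  0.000
H -0.757  0.586  0.000
3
water, displaced (made-up coordinates)
O  1.000  0.000  0.000
H  1.757  0.586  0.000
H  0.243  0.586  0.000
"""

structures = read_xyz_structures(example)
print(len(structures), "structures; consistent:", same_number_of_atoms(structures))
```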

Geometry + Electronic Structure files: 

*  **.fchk**
*  **.molden**

*Note:* If the electronic structure is not provided, *Cosymlib* can generate one automatically using an Extended Hückel calculation if needed. *Cosymlib* is also able to write a .fchk file for this calculation so that you can afterwards visualize your molecular orbitals using external programs such as *Avogadro*.

To proceed with the tutorial you first need to download a few files containing molecular data that we have prepared in advance for you. If *Cosymlib* is installed on your computer, you just need to have these files in the folder from which you run your commands.

In [None]:
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/ethane.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/tetbrneopentane.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/twistane.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/vanadate.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/trisen_co.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/snub_dode.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/co_compl.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/sf6.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/h2o2.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/ferrocene_ec.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/c5_cp.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/cyclohex_chair.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/diborane.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/difluoethene.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/ferrocene_st.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/ge9.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/methane.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/dodecahedrane.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/dodecaborate.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/cyclobutane.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/crown_18_6.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/c70.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/c60.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/co_c2_compl.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/benzenetriol.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/ammonia.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/allene.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/adamantane.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/h2o.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/square.xyz
!wget https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/cosymlib/pere_tutorial/docs/tutorials/h2o2_path.xyz
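
If `wget` is not available on your system, the same files can be fetched from Python. The sketch below uses only the standard library; the helper names `tutorial_url` and `download_tutorial_files` are our own, and the file list is just a small subset of the files above that you can extend as needed.

```python
from urllib.request import urlretrieve

# Folder that holds the tutorial files (same location as in the wget calls)
BASE_URL = (
    "https://raw.githubusercontent.com/GrupEstructuraElectronicaSimetria/"
    "cosymlib/pere_tutorial/docs/tutorials/"
)

# A few of the files listed above; extend the list as needed
FILES = ["h2o.xyz", "h2o2_path.xyz", "square.xyz"]


def tutorial_url(filename):
    """Full URL of one tutorial file."""
    return BASE_URL + filename


def download_tutorial_files(filenames=FILES):
    """Download each file into the current folder."""
    for name in filenames:
        urlretrieve(tutorial_url(name), name)
        print("downloaded", name)
```

Calling `download_tutorial_files()` then saves the listed files in the current folder.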


# Using Gsym to determine the point group for a molecule 


`gsym` is the script used to calculate continuous symmetry measures (CSMs) with respect to some simple point-symmetry groups for geometrical objects described as a set of vertices. Before computing actual CSMs, `gsym` can simply be employed to determine the point group of a molecule. For this purpose we call `gsym` with the `-pg` option.

Before starting, let's take a look at the `h2o.xyz` file using the Linux `cat` command:

In [None]:
!cat h2o.xyz

Just by looking at the coordinates it is difficult to guess the symmetry of the molecule, since it is in an arbitrary position and orientation. This is, however, not important for the `gsym` script: it will find the point group regardless of the position and orientation in which we give the coordinates of the molecule:

In [None]:
!gsym h2o.xyz -pg

We have prepared a series of .xyz files with different molecules. If you want to find their point groups, just copy the command in the cell above into a new cell, change the name of the .xyz file and execute the cell.
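
Instead of editing the cell by hand for each molecule, you can also loop over several files from Python. The following is a minimal sketch, not part of *Cosymlib*: the helper `point_group_command` is our own, the list contains a few of the files downloaded above, and the command is only executed when the `gsym` executable and the file are actually found.

```python
import os
import shutil
import subprocess


def point_group_command(xyz_file):
    """Command line (as a list of arguments) for a point-group search."""
    return ["gsym", xyz_file, "-pg"]


# Some of the files downloaded above
files = ["methane.xyz", "ammonia.xyz", "sf6.xyz", "c60.xyz"]

for xyz_file in files:
    cmd = point_group_command(xyz_file)
    print("$", " ".join(cmd))
    # Run the command only if gsym is installed and the file is present
    if shutil.which("gsym") and os.path.exists(xyz_file):
        subprocess.run(cmd)
```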

You can also use the `gsym` script to find the point group for a series of molecules contained in the same `.xyz` file. The only restriction is that all structures must have the same number of atoms. The `h2o2_path.xyz` file contains a series of structures for the H-O-O-H molecule in which the dihedral angle is varied between 0° and 180° in steps of 10°. If you run the script you will see that the two planar configurations have different point groups, while all intermediate structures share the same symmetry.

In [None]:
!gsym h2o2_path.xyz -pg 
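
A file of this kind is easy to build yourself. The sketch below generates H-O-O-H geometries for dihedral angles from 0° to 180° in steps of 10° and formats them as a multi-structure `.xyz` text. The bond lengths and the O-O-H angle are assumed, illustrative values and are not taken from `h2o2_path.xyz`; the function names are our own.

```python
import math

# Illustrative internal coordinates (assumed values, not those of h2o2_path.xyz)
R_OO = 1.47                       # O-O distance in angstrom
R_OH = 0.96                       # O-H distance in angstrom
ANGLE_OOH = math.radians(100.0)   # O-O-H bond angle


def h2o2_geometry(dihedral_deg):
    """Return [(label, x, y, z), ...] for H-O-O-H with the given dihedral angle."""
    phi = math.radians(dihedral_deg)
    s, c = math.sin(ANGLE_OOH), math.cos(ANGLE_OOH)
    o1 = (0.0, 0.0, 0.0)
    o2 = (0.0, 0.0, R_OO)                       # O-O bond along the z axis
    h1 = (R_OH * s, 0.0, R_OH * c)              # first H lies in the xz plane
    h2 = (R_OH * s * math.cos(phi),             # second H is rotated by phi
          R_OH * s * math.sin(phi),             # around the O-O axis
          R_OO - R_OH * c)
    return [("H", *h1), ("O", *o1), ("O", *o2), ("H", *h2)]


def to_xyz(atoms, title):
    """Format one structure in .xyz layout."""
    lines = [str(len(atoms)), title]
    lines += [f"{label:2s} {x:12.6f} {y:12.6f} {z:12.6f}" for label, x, y, z in atoms]
    return "\n".join(lines) + "\n"


angles = range(0, 181, 10)
xyz_text = "".join(to_xyz(h2o2_geometry(a), f"H2O2 dihedral {a}") for a in angles)

print(to_xyz(h2o2_geometry(0), "H2O2 dihedral 0"))
```

Writing `xyz_text` to a file gives a multi-structure input in which every structure has the same four atoms, as required.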

Remember that the determination of the point group relies on a numerical threshold that is used to decide whether the original structure and its image after applying a symmetry operation coincide or not.
The default value for this threshold is 0.01, but it can be modified with the `--pg_thresh eps` option, where `eps` is a real number. The smaller `eps` is, the stricter the coincidence required to detect the presence of a symmetry operation.
We have included two structures for a square of hydrogen atoms in the `square.xyz` file. In the first one we have a perfect square, while in the second one we have included a small deviation of the atoms out of the molecular plane. Take a look at the two structures using the `cat` command.

In [None]:
!cat square.xyz

If we now run the `gsym` script to determine the point group of the two structures, it will correctly detect that the second structure does not have perfect square symmetry:

In [None]:
!gsym square.xyz -pg 

We can, however, relax the threshold to allow for small deviations from symmetry by increasing it to 0.02. In this case, `gsym` will tell us that both structures have square symmetry.

In [None]:
!gsym square.xyz -pg --pg_thresh 0.02

This example shows that you must be careful when using `gsym`: the result does not guarantee that the structure has exactly the detected symmetry, it only tells you that the structure has the detected symmetry within the numerical threshold. If you are not sure, you can try running the program with smaller threshold values. The default threshold of 0.01 is a good compromise for structural chemistry: atomic positions are themselves subject to some experimental error, so it makes no sense to run the program with overly strict thresholds.
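
The role of the threshold can be illustrated with a toy check written in plain Python. This is only a sketch of the general idea (apply an operation, compare the image with the original, accept if the mismatch is below `eps`); it is not the algorithm implemented in `gsym`, and the coordinates are made up so that the outcome mirrors the example above. It requires Python 3.8 or later for `math.dist`.

```python
import math


def reflect_z(points):
    """Reflect all points through the z = 0 plane."""
    return [(x, y, -z) for x, y, z in points]


def max_mismatch(points, image):
    """Largest distance from an atom of the image to the nearest original atom."""
    return max(min(math.dist(p, q) for q in points) for p in image)


def has_operation(points, operation, eps):
    """Accept the operation if the image coincides with the original within eps."""
    return max_mismatch(points, operation(points)) <= eps


# A perfect square in the z = 0 plane (made-up coordinates)
perfect = [(1.0, 1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0)]

# The same square with atoms moved slightly out of the plane
d = 0.007
distorted = [(1.0, 1.0, d), (-1.0, 1.0, -d), (-1.0, -1.0, d), (1.0, -1.0, -d)]

print(has_operation(perfect, reflect_z, eps=0.01))    # True
print(has_operation(distorted, reflect_z, eps=0.01))  # False
print(has_operation(distorted, reflect_z, eps=0.02))  # True
```

With the stricter value the mirror plane of the distorted square is rejected, while the looser value accepts it, which is the same trade-off you control with `--pg_thresh`.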