We recommend using Safari, Google Chrome, or Sogou to browse the database.

What does the ImmuMethy database do?

What is the difference between the uniformly processed beta values and the pre-processed beta values?

What is an r-beta curve?

How to query the database to get beta values and methylation profiles?

What is the meaning of the title fields of the basic description in the results?

What does the ImmuMethy database do?

ImmuMethy is a database of DNA methylation at single-cytosine resolution for human blood and immune cells. It shows methylation levels, expressed as beta values, for over 480,000 or 860,000 methylation sites in human blood and immune cells across different disease states and tissue sources. Methylation plasticity and profiles can also be analyzed for all sites. Based on the methylation profiles, differential methylation and methylation markers can be effectively evaluated.

In ImmuMethy, samples are categorized into different study datasets based on sample source, cell type, and disease state. ImmuMethy places particular emphasis on using large sample sizes to illustrate methylation profiles and plasticity, which helps reveal dynamic changes in methylation levels under various experimental conditions.

What is the difference between the uniformly processed beta values and the pre-processed beta values?

The uniformly processed beta values are derived from uniform processing of raw data in the IDAT file format. The values were further adjusted for the Infinium I and Infinium II probe designs to produce beta values normalized within each array, thus facilitating downstream comparison between methylation sites and methylation plasticity analysis.

The pre-processed beta values are those that were processed by the data submitters and deposited in the GEO and ArrayExpress databases. These values are used directly because the corresponding raw IDAT files were not provided by the data submitter. Generally, these data were also uniformly processed and normalized across samples within the same study (the same GEO series/GSE ID or ArrayExpress accession ID) by the data submitter.

In query results downloaded in Microsoft Excel file format, the pre-processed beta values are labeled "intact" in the column "Normalized beta value".
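As a rough illustration of how the "intact" label could be used after download, the sketch below separates rows that carry a uniformly processed value from rows that only have a pre-processed value. Only the column name "Normalized beta value" and the label "intact" come from this FAQ. The row layout, the other column names ("Sample", "Beta value"), and the helper `split_by_processing` are hypothetical, so adapt them to the actual file.

```python
# Assumption: each row of the downloaded sheet has been read into a dict.
# "Sample" and "Beta value" are made-up column names for illustration only.
COLUMN = "Normalized beta value"
INTACT_LABEL = "intact"


def split_by_processing(rows):
    """Split rows into (uniformly processed, pre-processed only)."""
    uniform, pre_processed = [], []
    for row in rows:
        value = row[COLUMN]
        # A text cell equal to "intact" marks a pre-processed value.
        if isinstance(value, str) and value.strip().lower() == INTACT_LABEL:
            pre_processed.append(row)
        else:
            uniform.append(row)
    return uniform, pre_processed


rows = [
    {"Sample": "S1", "Beta value": 0.8123, COLUMN: 0.8051},
    {"Sample": "S2", "Beta value": 0.4410, COLUMN: "intact"},
    {"Sample": "S3", "Beta value": 0.1507, COLUMN: 0.1622},
]
uniform_rows, pre_rows = split_by_processing(rows)
print(len(uniform_rows), "uniformly processed;", len(pre_rows), "pre-processed only")
```

Keeping the two groups apart may be useful because pre-processed values were normalized by each submitter and are not necessarily comparable across studies.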

What is an r-beta curve?

The term r-beta means rounded beta value. In ImmuMethy, uniformly processed beta values, which range from 0 to 1, are rounded to two decimal places. Therefore, all beta values of a methylation site within a set of samples fall onto 101 break points (0.00, 0.01, ..., 1.00) representing different methylation intensities. The sample ratios at each intensity are connected to form an r-beta curve. In ImmuMethy, multiple r-beta curves can be displayed simultaneously for methylation profile comparison (see the example curves below).
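The construction described above can be sketched in a few lines of Python. This is a minimal illustration and not ImmuMethy's own code: the function name `r_beta_curve` is hypothetical, "sample ratio" is taken to mean the fraction of samples at each break point, and the handling of exact rounding ties follows Python's built-in `round`, which may differ from what the database uses.

```python
from collections import Counter


def r_beta_curve(beta_values):
    """Return 101 (r_beta, sample_ratio) points for one methylation site."""
    if not beta_values:
        raise ValueError("at least one beta value is required")
    if any(b < 0 or b > 1 for b in beta_values):
        raise ValueError("beta values must lie between 0 and 1")
    # Round to two decimals, stored as an integer 0..100 so that the
    # break points are exact and not subject to floating-point keys.
    counts = Counter(round(b * 100) for b in beta_values)
    n = len(beta_values)
    # One point per break point; intensities with no samples get ratio 0.
    return [(i / 100, counts.get(i, 0) / n) for i in range(101)]


# Toy data: three samples near 0.12 and one at 0.50.
curve = r_beta_curve([0.12, 0.118, 0.121, 0.50])
for r_beta, ratio in curve:
    if ratio > 0:
        print(r_beta, ratio)
```

Plotting the 101 points with any line-chart tool and joining them gives the curve. Curves for several study datasets can be overlaid because they share the same x-axis.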

The larger the sample size, the smoother the r-beta curve becomes. Note that the study dataset of normal human peripheral blood (ID: PB_blood_normal) contains over 10,000 samples and therefore provides an excellent control for differential methylation and methylation marker analysis. However, large sample sizes may slow down access.

Please note that the rounded beta values are used only to illustrate methylation profiles of uniformly processed beta values. All other analyses, including statistical tests, use the complete, unrounded beta values.

How to query the database to get beta values and methylation profiles?

ImmuMethy supports queries by gene symbol (or alias), CpG site, and chromosome position.

The basic workflow is as follows.


1. Enter a gene symbol or CpG sites, or select a chromosome and enter a range of chromosome positions.

2. Select a study dataset. Click the plus sign before each cell type to expand or collapse its study datasets. Multiple study datasets can be selected.

3. Choose one of the following two options.

Option A:

Click the submit button, and the beta value result page will open. A basic description of the queried methylation sites is shown on the page (see the explanation of the result fields below). In addition, click the download links on the page to obtain all beta values in Excel file format. The pre-processed and/or the current uniformly processed beta values are included in the file and can be used for further analysis.

Option B:

Click the "Show" link (see the figure below) to open the methylation profile result page.

The result page for Option B has four sections.

Section 1. It shows r-beta curves, which represent methylation profiles, that is, the sample distribution based on rounded beta values. The overall methylation intensity and the variability of the queried methylation sites across samples can therefore be easily identified.

Section 2. It uses box plots to show beta value distributions.

Section 3. It allows new curves to be added, facilitating new profile comparisons.

Section 4. It shows the basic description of all queried sites (see the FAQ below).

Please note that if there is no "Show" label, no uniformly processed data are available for the corresponding study dataset.

What is the meaning of the title fields of the basic description in the results?

The fields are explained below.

Study dataset: The categorized immune study dataset in the database.

CpG site ID: A unique CpG locus identifier.

Class: The main immune cell types in the database.

Sub class: Subdivisions of each main cell type based on phenotype.

Tissue source: The tissue from which the cells are derived.

Healthy state(disease): Whether the cells are derived from normal individuals or from patients with a specific disease.

Gene symbol: The gene associated with the CpG site.

GeneID: Entrez Gene ID.

Gene region feature: Gene region feature category (see the figure below for more information).

Relation to CpG island: Relationship to the canonical CpG island. Shores: 0-2 kb from the CpG island; Shelves: 2-4 kb from the CpG island (see the figure below for more information).

Hg38: The position of the queried site in the Hg38 genome assembly.

SNP(s) located in the probe: rsIDs of SNPs located in the probe. Multiple SNP rsIDs may be listed.

Distance of SNP(s) from query base of the probe: Distance of each SNP from the query base of the probe. When multiple distances are listed, each is associated with an rsID.

Minor allele frequency of SNP(s): Minor allele frequency (MAF) of the SNP(s) from the 1000 Genomes Project. When multiple MAFs are listed, each is associated with an rsID.

Download link: A link to download all beta values in the queried study datasets.

1st Qu: The 1st quartile (25th percentile, 1st Qu, Q1) of AVBs (average beta values). Shown only for methylation profiles.

Median: The 2nd quartile (50th percentile, 2nd Qu, Q2) of AVBs. Shown only for methylation profiles.

3rd Qu: The 3rd quartile (75th percentile, 3rd Qu, Q3) of AVBs. Shown only for methylation profiles.

Quantile range (Q75-Q25): The difference between the 75th and 25th percentiles of the beta values. Shown only for methylation profiles.

Quantile range (Q95-Q5): The difference between the 95th and 5th percentiles of the beta values. Shown only for methylation profiles.

Mean: Mean of AVBs. Shown only for methylation profiles.

Standard deviation: Standard deviation (SD) of AVBs. It quantifies the amount of variation or dispersion of the uniformly processed beta values in a study dataset. Shown only for methylation profiles.

Transcription factor: A 'see more' link indicates that the site may mediate DNA-TF interaction. Click 'see more' to show detailed information on predicted transcription factor binding. The binding information is from the MeDReader database.

Statistical results: P values for comparisons between the CpG site in the indicated row and the sites in the other rows. Shown only for methylation profiles.
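For readers who want to reproduce the descriptive fields above from a downloaded set of beta values, the following sketch computes them with Python's standard library. It is an illustration under stated assumptions and not ImmuMethy's implementation: the function name `describe_betas` is hypothetical, the "inclusive" percentile interpolation is an assumed choice, and the sample (n - 1) standard deviation is used because the FAQ does not say which variant the database reports.

```python
import statistics


def describe_betas(betas):
    """Summarize the beta values of one CpG site in one study dataset."""
    # 99 percentile cut points; cuts[k - 1] is the k-th percentile.
    # Assumption: "inclusive" interpolation between neighboring values.
    cuts = statistics.quantiles(betas, n=100, method="inclusive")
    q5, q25, q50, q75, q95 = (cuts[k - 1] for k in (5, 25, 50, 75, 95))
    return {
        "1st Qu": q25,
        "Median": q50,
        "3rd Qu": q75,
        "Quantile range (Q75-Q25)": q75 - q25,
        "Quantile range (Q95-Q5)": q95 - q5,
        "Mean": statistics.fmean(betas),
        # Assumption: sample standard deviation (n - 1 denominator).
        "Standard deviation": statistics.stdev(betas),
    }


# Toy data: 101 evenly spaced beta values from 0.00 to 1.00.
summary = describe_betas([i / 100 for i in range(101)])
for field, value in summary.items():
    print(f"{field}: {value:.4f}")
```

The two quantile ranges give a robust view of how much a site varies across samples, while the standard deviation is more sensitive to outlying samples.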
