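Experiment 8: Shell Programming (2): A Batch Download Program for PubMed Records
I. Objectives
1. Become familiar with NCBI's PubMed literature database;
2. Learn how to batch-download literature records from PubMed with a Shell script.
II. Environment
1. Operating system: Windows on the client, Linux on the server
2. Main software: PuTTY
III. Background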
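PubMed is a web-based retrieval system developed by the National Center for Biotechnology
Information (NCBI) at the U.S. National Library of Medicine (NLM). Through the NCBI platform
it offers free web-based searching of the MEDLINE database and links to full text, some of
which is free. It also gives access to the molecular biology databases maintained by NCBI.
In August 1999 PubMed was incorporated into Entrez, the general-purpose browser developed by
NCBI, and its search interface changed.
NCBI provides a batch download tool, efetch (http://www.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html),
which can download gene sequences, protein sequences, literature abstracts and similar records in bulk. For example: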
wget "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=7764678\
&retmode=text&rettype=medline"
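(Note: the URL must be enclosed in quotes. Otherwise the shell treats each "&" as a command
separator, and wget receives only the part of the URL before the first "&".) This command
downloads the record for the article whose PubMed ID is 7764678: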
PMID- 7764678
OWN - NLM
STAT- MEDLINE
DA - 19940606
DCOM- 19940606
LR - 20081121
IS - 8756-7938 (Print)
IS - 1520-6033 (Linking)
VI - 10
IP - 2
DP - 1994 Mar-Apr
TI - Intermolecular electrostatic interactions and their effect on flux and protein
deposition during protein filtration.
PG - 207-13
AB - Although membrane filtration is used extensively to process protein solutions
containing a variety of electrolytes, there is currently little fundamental understanding
of the effect of the solution environment (and in particular, the solution pH) on the
filtrate flux in these systems. We have obtained data for the flux and sieving coefficients
during the batch (stirred cell) filtration of solutions of bovine serum albumin,
immunoglobulins, hemoglobin, ribonuclease A, and lysozyme through 0.16-micron microfiltration
membranes at different pH values. The flux declined significantly for all five proteins
due to the formation of a protein deposit on the upper surface of the membrane. The
quasi-steady ultrafiltrate fluxes at the individual protein isoelectric pH's were essentially
identical, despite the large differences in molecular weight and physicochemical characteristics
of these proteins. The flux increased at pH's away from the isoelectric point, with the data
well-correlated with the protein surface charge density. These results were explained in terms
of a simple physical model in which the protein deposit continues to grow, and thus the flux
continues to decline, until the drag force on the proteins associated with the filtrate flow
is no longer able to overcome the intermolecular repulsive interactions between the proteins
in the bulk solution and those in the protein deposit on the surface of the membrane.
AD - Department of Chemical Engineering, University of Delaware, Newark 19716.
FAU - Palecek, S P
AU - Palecek SP
FAU - Zydney, A L
AU - Zydney AL
LA - eng
GR - R01-HL-39455-02/HL/NHLBI NIH HHS/United States
PT - Journal Article
PT - Research Support, U.S. Gov't, P.H.S.
PL - UNITED STATES
TA - Biotechnol Prog
JT - Biotechnology progress
JID - 8506292
RN - 0 (Membrane Proteins)
RN - 0 (Proteins)
SB - B
MH - Chemistry, Physical
MH - Electrochemistry
MH - Hydrogen-Ion Concentration
MH - Isoelectric Focusing
MH - Membrane Proteins/chemistry
MH - Models, Chemical
MH - Molecular Weight
MH - Physicochemical Phenomena
MH - Protein Conformation
MH - Proteins/*chemistry
MH - Ultrafiltration
EDAT- 1994/03/01
MHDA- 1994/03/01 00:01
CRDT- 1994/03/01 00:00
AID - 10.1021/bp00026a010 [doi]
PST - ppublish
SO - Biotechnol Prog. 1994 Mar-Apr;10(2):207-13.
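wget is powerful and provides many options, such as -q (suppress progress messages), -O (write
the output to a file; it is followed by the file name, or by - for standard output), and -i
(read the URLs from a file).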
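The three options can be combined as in the following minimal sketch. The helper names efetch_url and make_url_list and the file names urls.txt and records.txt are hypothetical and chosen only for illustration; the URL pattern is the one from the wget example in this section. The commands that contact NCBI are left as comments because they need network access.

```shell
#!/bin/bash
# Build the efetch URL for one PubMed ID (hypothetical helper; the URL
# pattern is copied from the wget example in this handout).
efetch_url() {
    echo "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=$1&retmode=text&rettype=medline"
}

# Print one URL per line for every PubMed ID given as an argument.
# The output can be saved to a file and passed to wget -i.
make_url_list() {
    for id in "$@"; do
        efetch_url "$id"
    done
}

# Examples (network access required), left commented out:
#   -q -O - : print the record to standard output and show only the TI lines
#   wget -q -O - "$(efetch_url 7764678)" | grep '^TI'
#   -i      : read the URLs from a file; with -O all records go into one file
#   make_url_list 7764678 1768001 > urls.txt
#   wget -q -i urls.txt -O records.txt
```

Because the URL is produced inside "$(...)" and passed in double quotes, the "&" characters reach wget unchanged.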
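IV. Tasks
1. Create the directory exp_8 under linux/exp in your home directory;
2. In exp_8, write the Shell script get_pubmed.sh, which downloads the records with PubMed IDs
1768001 through 1768010. The program must take two command-line arguments: the first is the
PubMed ID of the first article, and the second is the PubMed ID of the last article.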
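One possible shape for get_pubmed.sh is sketched below; it is a starting point under stated assumptions, not a reference solution. The function names, the output naming scheme (one file named PMID.txt per record) and the one-second pause between requests are all assumptions made for this sketch. The pause is a conservative choice, since NCBI asks clients to limit how often they send requests.

```shell
#!/bin/bash
# get_pubmed.sh (sketch): download MEDLINE records for a range of PubMed IDs.
# Usage: ./get_pubmed.sh FIRST_PMID LAST_PMID

# Build the efetch URL for one PubMed ID (URL pattern from the handout).
efetch_url() {
    echo "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=$1&retmode=text&rettype=medline"
}

# Succeed only if the argument is a non-empty string of digits.
is_pmid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Download every record from $1 to $2 inclusive.
# Assumption: each record is saved as <PMID>.txt in the current directory.
download_range() {
    if [ "$#" -ne 2 ] || ! is_pmid "$1" || ! is_pmid "$2" || [ "$1" -gt "$2" ]; then
        echo "Usage: get_pubmed.sh FIRST_PMID LAST_PMID (FIRST_PMID <= LAST_PMID)" >&2
        return 1
    fi
    id=$1
    while [ "$id" -le "$2" ]; do
        # Report a failed download but keep going with the next ID.
        wget -q -O "${id}.txt" "$(efetch_url "$id")" || echo "Failed: $id" >&2
        sleep 1   # assumed courtesy delay between requests
        id=$((id + 1))
    done
}

# Run only when arguments are given, so the functions can also be sourced alone.
if [ "$#" -gt 0 ]; then
    download_range "$@"
fi
```

With this sketch, running ./get_pubmed.sh 1768001 1768010 in exp_8 would be expected to leave ten files, 1768001.txt through 1768010.txt, in that directory.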
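V. Lab Report
1. The experimental environment (operating system and software), the steps taken, and a record of the result files;
2. The problems encountered during the experiment and how they were solved.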