Published online June 5, 2007

DUDE: A User-Friendly Crop Information System

Weikai Yan* and Nicholas A. Tinker

ABSTRACT

A crop information system facilitates the storage, retrieval, and utilization of historical crop performance data. Here, we report a user-friendly crop information system entitled Data Unification and Distillation Engine (DUDE), which (i) converts data from spreadsheet formats to a relational database, (ii) allows editing of tables in the database, (iii) simplifies the construction and execution of queries from the database, (iv) formats queried data into a variety of file structures for data analyses, and (v) simplifies maintenance and corrections of data in the database. DUDE runs on any personal computer that operates with Windows 2000 or later versions of Microsoft Windows. DUDE is freely available from the authors.
Data from crop performance trials are not only essential for achieving the short-term goals of selecting and recommending crop cultivars but also valuable in addressing long-term questions about the target environments and the crop as an integrated physiological system (Yan and Tinker, 2005; Yan et al., 2007). Two types of informatics tools are essential to achieving such long-term goals: tools for assembling historical crop performance trial data and tools for exploring patterns within such data. The biplot analysis system (Yan et al., 2000, 2007; Yan, 2001; Yan and Rajcan, 2002; Yan and Kang, 2003; Yan and Tinker, 2005, 2006) has been described as an effective tool for exploring historical crop performance trial data. This paper describes a user-friendly crop information system (DUDE) that can facilitate the assembly and use of crop performance data.

Tinker and Yan (2006) reviewed crop information systems that are currently available and described in detail an example of a relational database on which a crop information system could be based (Fig. 1). This database structure is referred to as a Context Oriented Observation Library (COOL), which is the underlying relational database for DUDE. A COOL database has the following functionalities: (i) it accommodates various types of data (numeric or nonnumeric) for a large¹ number of variables (measured traits, genetic markers, treatment factors) for a large number of genotypes, from a large number of studies, years, and locations; (ii) it provides a mechanism for unifying the formats and terminologies of variables, genotypes, and locations, which are typically heterogeneous among different data sources; and (iii) it simplifies the construction and execution of queries for any data subset, thereby facilitating data mining. A more detailed description of the COOL database structure can be found in Tinker and Yan (2006).
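As a rough illustration of the three functionalities listed above, the following Python sketch uses the standard-library sqlite3 module as a stand-in for a relational database. The table names, column names, and synonym lookup are hypothetical simplifications chosen for this example; they are not the COOL structure described by Tinker and Yan (2006), and all values are made up.

```python
import sqlite3

# Hypothetical, simplified schema (an assumption, NOT the actual COOL tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation (
    study TEXT, year INTEGER, location TEXT,
    genotype TEXT, variable TEXT, value TEXT
);
CREATE TABLE synonym (raw_name TEXT PRIMARY KEY, unified_name TEXT);
""")

# (i) One row per observation, so numeric and nonnumeric values for any
# number of variables can share the same table. Values are illustrative.
rows = [
    ("TrialA", 2004, "Ottawa", "G1", "Yield", "4.2"),
    ("TrialA", 2004, "Ottawa", "G1", "LodgingNote", "slight"),
    ("TrialB", 2005, "Ottawa", "G1", "YLD", "3.9"),
    ("TrialB", 2005, "Ottawa", "G2", "YLD", "4.5"),
]
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?, ?, ?)", rows)

# (ii) Heterogeneous variable names are mapped to one unified name.
conn.executemany("INSERT INTO synonym VALUES (?, ?)", [("YLD", "Yield")])


def query_variable(connection, variable):
    """(iii) Return (year, genotype, value) rows for one unified variable."""
    sql = """
        SELECT o.year, o.genotype, o.value
        FROM observation AS o
        LEFT JOIN synonym AS s ON s.raw_name = o.variable
        WHERE COALESCE(s.unified_name, o.variable) = ?
        ORDER BY o.year, o.genotype
    """
    return connection.execute(sql, (variable,)).fetchall()


yield_rows = query_variable(conn, "Yield")
```

Because the synonym lookup is applied at query time, records entered as "YLD" and "Yield" are returned together without the stored values being rewritten.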
The greatest difficulty in the use of COOL also derives from its greatest advantages, namely, its serialized data structure and its division of different data types into separate, linked tables. First, it is difficult to convert conventional spreadsheet data, which usually have identifiers, traits, and other measurements in parallel columns, to a COOL database while keeping the context information properly preserved. Although this can be done manually, it involves multiple steps and is error-prone even for highly trained workers. Second, although the serialized data structure of COOL facilitates querying subsets of the data, queried data must be reformatted to parallel data before they can be used in some types of data analysis. Again, although this can be done manually, it is tedious and prone to error.

The DUDE application was developed to solve both of these issues, thereby providing an integrated crop information system. DUDE simplifies these tasks through the use of wizards (i) to convert data from spreadsheet format to a COOL database, (ii) to edit various tables of the database, (iii) to unify trait (variable), genotype, and location names within the database, (iv) to query subsets of data from a COOL database, (v) to save queried data into required formats for data analyses, and (vi) to make corrections in the database. The DUDE application was described briefly by Tinker and Yan (2006). This paper provides a more detailed description of an updated version of DUDE and formally introduces its availability to the research community.

VIEWING AND EDITING A COOL DATABASE

When DUDE is executed, an opening panel offers two options: converting data from spreadsheets to a COOL database (Populate) or extracting data from a COOL database (Query). When the Populate button is clicked, an open file dialog will appear, asking for a Microsoft Access database with a COOL data structure, to which DUDE will be attached.
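The change of data layout behind the two options on the opening panel (parallel spreadsheet columns to serialized records, and back again for analysis) can be sketched in a few lines of Python. This is only an illustration of the idea under assumed field names and made-up values; it is not DUDE's code and does not reproduce the COOL tables.

```python
def to_serialized(rows, id_fields):
    """Turn parallel rows (one column per trait) into one record per value."""
    records = []
    for row in rows:
        context = {k: row[k] for k in id_fields}
        for name, value in row.items():
            if name in id_fields or value is None:
                continue  # skip identifier columns and missing values
            records.append({**context, "variable": name, "value": value})
    return records


def to_parallel(records, id_fields):
    """Rebuild one row per context, with one column per variable."""
    table = {}
    for rec in records:
        key = tuple(rec[k] for k in id_fields)
        row = table.setdefault(key, dict(zip(id_fields, key)))
        row[rec["variable"]] = rec["value"]
    return list(table.values())


# Hypothetical spreadsheet rows; field names and values are illustrative only.
sheet = [
    {"genotype": "G1", "location": "Ottawa", "yield": 4.2, "height": 95},
    {"genotype": "G2", "location": "Ottawa", "yield": 4.5, "height": None},
]
ids = ("genotype", "location")
serialized = to_serialized(sheet, ids)   # one record per measured value
parallel = to_parallel(serialized, ids)  # back to one row per genotype/location
```

Each serialized record carries its own context (here, genotype and location), which is what allows the parallel layout to be rebuilt later. Missing values are simply absent from the serialized form.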
When a valid COOL database is selected, an interface similar to Fig. 2 will be displayed, which provides options for viewing and editing each of the database tables and for populating the database. Each of the eight tables in the COOL database (Fig. 1) can be viewed by clicking the appropriate button on the top-left of the window (Fig. 2). If the Edit button is clicked and a table is selected, that table will be displayed in the upper-left area of Fig. 2, and each field of the current record (row) in that table will be displayed in the area on the right. This interface allows the table to be modified from either the upper-left area or the area on the right. One important use of the Edit mode is that heterogeneous trait, variety, and location names can be unified to facilitate data query and analysis (detailed in the data unification section below). A COOL database can also be edited directly through Microsoft Access. However, the user should be careful when deleting or adding fields of a table, as this may affect the integrity of the database and its compatibility with DUDE. It is possible to use DUDE to manage a COOL database without opening the database in MS Access.

¹ The current implementation of DUDE uses a COOL database in Microsoft Access file format. The theoretical limit on the number of records in an MS Access table is over 2 billion. However, the size of a database is limited to 2 GB, so it is not possible to specify the maximum number of records in a given table.

Eastern Cereal and Oilseed Research Centre (ECORC), Agriculture and Agri-Food Canada (AAFC), K.W. Neatby Building, 960 Carling Ave., Ottawa, ON, Canada, K1A 0C6. ECORC Contribution No. 06-735. Received 10 Oct. 2006. *Corresponding author (yanw@agr.gc.ca).

Published in Agron. J. 99:1029–1033 (2007). Software. doi:10.2134/agronj2006.0280. © American Society of Agronomy, 677 S. Segoe Rd., Madison, WI 53711 USA.

Abbreviations: COOL, context oriented observation library; DUDE, data unification and distillation engine.

Reproduced from Agronomy Journal. Published by American Society of Agronomy. All copyrights reserved.