There are two ways of doing text-based searches of your data, as described in this section:
• Quick-search directly from the search field in the Navigation Area.
• Advanced search, which makes it easy to make more specific searches.
In most cases, quick-search will find what you need, but if you need to be more specific in your search criteria, the advanced search is preferable.
4.4.1 What kind of information can be searched?
Below is a list of the different kinds of information that you can search for (applies to both quick-search and the advanced search).
• Name. The name of a sequence, an alignment or any other kind of element. The name is what is displayed in the Navigation Area by default.
• Length. The length of the sequence.
• Organism. Sequences which contain information about organism can be searched. In this way, you could search for e.g. Homo sapiens sequences.
• Custom attributes. Read more in section 4.2.
Only the first item in the list, Name, is available for all kinds of data. The rest are relevant only for sequences.
If you wish to perform a search for sequence similarity, use Local BLAST (see section 13.1.3) instead.
4.4.2 Quick search
At the bottom of the Navigation Area there is a text field, as shown in figure 4.15.
Figure 4.15: Search simply by typing in the text field and pressing Enter.
To search, simply enter the text to search for and press Enter.
Quick search results
To show the results, the search pane is expanded as shown in figure 4.16.
Figure 4.16: Search results.
If there are many hits, only the first 50 hits are immediately shown. At the bottom of the pane you can click Next to see the next 50 hits (see figure 4.17).
If a search gives no hits, you will be asked if you wish to search for matches that start with your search term. If you accept this, an asterisk (*) will be appended to the search term.
Pressing the Alt key while you click a search result will highlight the search hit in its folder in the Navigation Area.
In the preferences (see chapter 5), you can specify the number of hits to be shown.
Figure 4.17: Page two of the search results.
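If it helps to see the paging and the "starts with" fallback spelled out, the following Python sketch models them on a plain list of element names. It is only an illustration and not part of the Workbench: the function names, the whole-name matching and the case-sensitive comparison are all assumptions made for the example.

```python
PAGE_SIZE = 50  # page size described above; the real value is set in the preferences


def page(hits, page_number, page_size=PAGE_SIZE):
    """Return the hits shown on a given page (pages are numbered from 1)."""
    start = (page_number - 1) * page_size
    return hits[start:start + page_size]


def matches(name, term):
    """Assumed matching rule: a trailing * means 'starts with', otherwise exact."""
    if term.endswith("*"):
        return name.startswith(term[:-1])
    return name == term


def search(names, term, accept_prefix_fallback=True):
    """Search, and retry with an appended asterisk if nothing was found."""
    hits = [n for n in names if matches(n, term)]
    if not hits and accept_prefix_fallback and not term.endswith("*"):
        # Mirrors accepting the offer to search for matches that start with the term.
        hits = [n for n in names if matches(n, term + "*")]
    return hits


if __name__ == "__main__":
    names = ["brca1", "brca2", "tp53"]
    print(search(names, "brca"))
    print(page([f"seq{i}" for i in range(120)], 3))
```

Declining the fallback (accept_prefix_fallback=False) leaves the result list empty, which corresponds to answering no when asked whether to search for matches that start with your term.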
Special search expressions
When you write a search term in the search field, you can get help to write a more advanced search expression by pressing Shift+F1. This will reveal a list of guides as shown in figure 4.18.
Figure 4.18: Guides to help create advanced search expressions.
You can select any of the guides (using mouse or keyboard arrows), and start typing. If, for example, you wish to search for sequences named BRCA1, select "Name search (name:)", and type "BRCA1". Your search expression will now look like this: "name:BRCA1".
The guides available are these:
• Wildcard search (*). Appending an asterisk * to the search term will find matches starting with the term. E.g. searching for "brca*" will find both brca1 and brca2.
• Search related words ( ). If you don't know the exact spelling of a word, you can append a question mark to the search term. E.g. "brac1*" will find sequences with a brca1 gene.
• Include both terms (AND). If you write two search terms, you can define if your results have to match both search terms by combining them with AND. E.g. search for "brca1 AND human" will find sequences where both terms are present.
• Include either term (OR). If you write two search terms, you can define that your results have to match either of the search terms by combining them with OR. E.g. search for "brca1 OR brca2" will find sequences where either of the terms is present.
• Name search (name:). Search only the name of the element.
• Organism search (organism:). For sequences, you can specify the organism to search for. This will look in the "Latin name" field which is seen in the Sequence Info view (see section 11.4).
• Length search (length:[START TO END]). Search for sequences of a specific length. E.g. search for sequences between 1000 and 2000 residues: "length:[1000 TO 2000]".
If you do not use this special syntax, the search automatically covers name, description, organism, etc., and the search terms are combined as if you had put OR between them.
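The guides listed above all produce plain text expressions, so they are easy to compose. The Python sketch below shows one way such expressions could be assembled as strings before being typed or pasted into the search field. The helper names are hypothetical and are not Workbench functions, and the bracketed length range follows the syntax given in the guide name (length:[START TO END]).

```python
def field_query(field, term):
    """Restrict a term to one field, e.g. name:BRCA1 or organism:human."""
    return f"{field}:{term}"


def length_query(start, end):
    """Length range in the form given by the guide: length:[START TO END]."""
    return f"length:[{start} TO {end}]"


def prefix(term):
    """Wildcard search: match anything starting with the term."""
    return term + "*"


def all_of(*terms):
    """Results must match every term."""
    return " AND ".join(terms)


def any_of(*terms):
    """Results may match any one of the terms."""
    return " OR ".join(terms)


def implied_or(text):
    """Make the default behaviour explicit: plain terms are combined with OR."""
    return any_of(*text.split())


if __name__ == "__main__":
    print(all_of(field_query("name", prefix("brca")), length_query(1000, 2000)))
    print(implied_or("brca1 human"))
```

Combining a field-restricted term with a length range in a single expression, as in the first print statement, is an assumption about how the guides compose; the manual only shows each guide on its own.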
Quick search history
You can access the 10 most recent searches by clicking the icon next to the search field (see figure 4.19).
Figure 4.19: Recent searches.
Clicking one of the recent searches will conduct the search again.
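A history like this behaves as a small bounded list in which the newest search comes first and the oldest is dropped once the limit is reached. The Python sketch below illustrates that idea only; the class is hypothetical, and moving a repeated search to the top rather than storing it twice is an assumption.

```python
from collections import deque


class SearchHistory:
    """Keeps the most recent search expressions, newest first."""

    def __init__(self, limit=10):
        # A deque with maxlen silently discards the oldest entry when full.
        self._items = deque(maxlen=limit)

    def record(self, expression):
        # Assumption: repeating a search moves it to the top instead of duplicating it.
        if expression in self._items:
            self._items.remove(expression)
        self._items.appendleft(expression)

    def recent(self):
        return list(self._items)


if __name__ == "__main__":
    history = SearchHistory()
    for i in range(12):
        history.record(f"name:seq{i}")
    print(history.recent())
```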
4.4.3 Advanced search
As a supplement to the Quick search described in the previous section, you can use the more advanced search:
Edit | Local Search or Ctrl + F (⌘ + F on Mac)
This will open the search view as shown in figure 4.20.
The first thing you can choose is which location should be searched. All the active locations are shown in this list. You can also choose to search all locations. Read more about locations in section 4.1.1.
Furthermore, you can specify what kind of elements should be searched:
• All sequences
• Nucleotide sequences
• Protein sequences
• All data
Figure 4.20: Advanced search.
When searching for sequences, you will also get alignments, sequence lists, etc. as results if they contain a sequence that matches the search criteria.
Below are the search criteria. First, select a relevant search filter in the Add filter: list. For sequences you can search for:
• Name
• Length
• Organism
See section 4.4.2 for more information on individual search terms.
For all other data, you can only search for name.
If you use Any field, it will search all of the above plus the following:
• Description
• Keywords
• Common name
• Taxonomy name
To see this information for a sequence, switch to the Element Info view (see section 11.4).
For each search line, you can choose whether to match the exact term (select "is equal to") or only the start of the term (select "begins with").
An example is shown in figure 4.21.
This example will find human nucleotide sequences (organism is Homo sapiens), and it will only find sequences shorter than 10,000 nucleotides.
Figure 4.21: Searching for human sequences shorter than 10,000 nucleotides.
Note that a search can be saved for later use. You do not save the search results, only the search parameters. This means that you can easily conduct the same search later on when your data has changed.
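The difference between saving parameters and saving results can be made concrete with a small Python sketch. It models each search line as a field, an operator and a value, requires every line to match, and stores only those lines as JSON. Everything here is an assumption for illustration: the record layout, the function names and the "is less than" operator label (used for the length criterion in the example above) are hypothetical and do not describe the Workbench's own file format.

```python
import json

# Operator labels; "is equal to" and "begins with" come from the text above,
# "is less than" is a hypothetical label for the length criterion.
OPERATORS = {
    "is equal to": lambda actual, wanted: actual == wanted,
    "begins with": lambda actual, wanted: str(actual).startswith(str(wanted)),
    "is less than": lambda actual, wanted: actual < wanted,
}


def line_matches(record, search_line):
    actual = record.get(search_line["field"])
    if actual is None:
        return False  # elements without this information cannot match
    return OPERATORS[search_line["operator"]](actual, search_line["value"])


def run_search(records, search_lines):
    """Return the records that satisfy every search line."""
    return [r for r in records if all(line_matches(r, s) for s in search_lines)]


def save_search(search_lines):
    """Only the parameters are stored, never the hits."""
    return json.dumps(search_lines)


def load_search(saved):
    return json.loads(saved)


if __name__ == "__main__":
    human_short = [
        {"field": "organism", "operator": "is equal to", "value": "Homo sapiens"},
        {"field": "length", "operator": "is less than", "value": 10000},
    ]
    data = [{"name": "A", "organism": "Homo sapiens", "length": 5000}]
    saved = save_search(human_short)
    data.append({"name": "E", "organism": "Homo sapiens", "length": 900})
    print([r["name"] for r in run_search(data, load_search(saved))])
```

Because only the search lines are stored, rerunning the loaded search after new data has been added picks up the new element as well.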
4.4.4 Search index
This section has a technical focus and is not relevant if your search works fine.
However, if you do not get the hits you expect, it might be because of an index error.
The CLC Genomics Workbench automatically maintains an index of all data in all locations in the Navigation Area. If this index gets out of sync with the data, searches may return unexpected or incomplete results. In this case, you can rebuild the index:
Right-click the relevant location | Location | Rebuild Index
This will take a while depending on the size of your data. At any time, the process can be stopped in the process area, see section 3.3.1.
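To see why an out-of-sync index gives missing hits and why rebuilding fixes it, consider the toy index below. It is a generic Python sketch of a word-to-element lookup table and makes no claim about how the Workbench stores its index internally; the class and method names are hypothetical.

```python
class SearchIndex:
    """Toy index mapping each word in an element name to element ids."""

    def __init__(self):
        self._by_word = {}

    def rebuild(self, elements):
        """Discard the old index and read every element again."""
        self._by_word = {}
        for element_id, name in elements.items():
            for word in name.lower().split():
                self._by_word.setdefault(word, set()).add(element_id)

    def lookup(self, word):
        return sorted(self._by_word.get(word.lower(), set()))


if __name__ == "__main__":
    data = {1: "BRCA1 human"}
    index = SearchIndex()
    index.rebuild(data)

    data[2] = "BRCA2 human"       # data changes, but the index is not updated
    print(index.lookup("brca2"))  # the stale index does not know the new element

    index.rebuild(data)           # corresponds to Rebuild Index
    print(index.lookup("brca2"))
```

A full rebuild reads every element again, which is why the time it takes depends on the size of your data.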
User preferences and settings