SAP Knowledge Base Article - Public

2091005 - Recruiting Marketing: How Search Works

Symptom


The RMK keyword search does not return the expected results. Why?
What is the search logic in RMK?

Environment

SAP SuccessFactors Recruiting Marketing

Resolution

Keyword search is used throughout the Recruiting Marketing (RMK) platform, both on the front-end career site and in backend rules and processes. RMK keyword search is powered by an indexing and search process modeled after large web search engines such as Google, and it uses several types of logic to display jobs (or sometimes other content) by relevance. This document describes the search logic and gives a basic functional overview.

Fields searchable
A number of fields are searched. This includes many more than just the title and description fields and could explain why some results may appear unexpected. The list is as follows:

"title" 
"internaltitle"
"department"
"description"
"internaldescription" 
"segments"
"facility" 
"businessunit"
"companylocation" 
"location" 
"city"
"state" 
"statename" 
"country" 
"countryname" 
"marketmajor" 
"marketsecondary" 
"reqid"
"zip" 

Note that the description field that is searched is the description payload, which includes code. A term can therefore match because it appears in a URL, for example, rather than in the visible text of the description page.

Basic Search Types

  • Single Term: A search using a single keyword, for example ‘Sales’.
Description: When a single search term such as ‘Sales’ is used, the database is searched for all jobs containing the keyword ‘Sales’, and the jobs are displayed in order of relevancy.
  •  Multi Term Search: Two or more keywords are used to perform the search, for example ‘Project Manager’.

Description: When two or more terms are used to perform the search, an ‘OR’ operator is invisibly placed between the keywords.
In the example ‘Project Manager’, all jobs with the keyword ‘Project’ or ‘Manager’ are returned in the results, with the highest-scoring (most relevant) jobs appearing first on the list.

  •  Searching Without a Search Term: Performing a job search without entering any keyword search terms into the search box.

Description: When an open search without any keywords is performed, all active jobs are returned in the result set. Relevancy scoring is still applied to determine the job order; however, since there is no keyword for the jobs to be scored against, they are returned in a seemingly random order.

  •  Search using “Quotations”: Two or more keyword terms wrapped in “Quotations” are treated as a single term.

Description: When keywords or phrases are wrapped in “Quotations”, only jobs containing the exact same set of keywords in the same order are returned.
For example, the search “Project Manager” is treated as a single keyword search. Jobs are returned only if the terms Project and Manager appear next to each other somewhere within the job.
It is important to note that when using “quotations” to perform a search, stemming is still applied and stop words are removed (see stemming and stop words for more information).

  • Search using (Parentheses): When two or more keyword terms are wrapped with (Parentheses), the database is searched for jobs with either term.

Description: While it is not initially evident how this type of search is any different from a basic ‘Multi Term Search’, the use of (Parentheses) can be valuable when performing more advanced/complex search queries that involve specific individual data points.

  •  Search using a Boolean Operator: A search using a Boolean Operator such as AND/NOT/OR (in full caps) will further refine the search.

Description: An example of using a Boolean Operator is placing AND or NOT (in full caps) between two or more keyword terms. The search ‘project AND manager’ returns jobs only when both keywords are found. Conversely, the search ‘project NOT manager’ returns jobs only when ‘project’ is found without ‘manager’. Note: the keyword terms do not need to be found next to each other in the job description.

  • Wildcard Characters (question marks, asterisks, and tilde): Wildcards are characters which can be used as search criteria to narrow or expand alternatives in the search results.
         Question Mark Character (?) – the ? question mark character can be used when searching for terms where a single character is not known or may vary in the desired results. When the exact spelling of a name is not known, the question mark character can be used to identify terms with any character in that location.
         Multiple Character (*) – the * asterisk character can be used as a wildcard when searching for terms where any number of characters are not known. Using the asterisk results in the greatest diversity of matches. When the search needs to include a wide variety of results, the asterisk is the best option.
         Fuzzy searches (~) – to do a fuzzy logic search, use the tilde ~ symbol at the end of a single term. Fuzzy logic provides assistive search logic to find similar spellings. For example, to search for a term similar in spelling to roam, use a tilde at the end of the word: “roam~”. This search will find terms like foam, roam, roams, room, road, roads, etc.
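
The search types above can be sketched in a few lines of Python. This is an illustration only, not the RMK implementation: the function names and the whitespace tokenizer are hypothetical, stemming and stop-word removal are left out, and the edit-distance limit of 2 for the fuzzy match is an assumption chosen because it covers the “roam~” examples listed above.

```python
from fnmatch import fnmatchcase


def tokens(text):
    """Lowercase a text and split it on whitespace (hypothetical tokenizer)."""
    return text.lower().split()


def match_any(job, terms):
    """Multi-term search: an implicit OR between the terms."""
    words = set(tokens(job))
    return any(t.lower() in words for t in terms)


def match_all(job, terms):
    """'project AND manager': every term must be present."""
    words = set(tokens(job))
    return all(t.lower() in words for t in terms)


def match_not(job, wanted, unwanted):
    """'project NOT manager': first term present, second term absent."""
    words = set(tokens(job))
    return wanted.lower() in words and unwanted.lower() not in words


def match_phrase(job, phrase):
    """Quoted search: the terms must appear next to each other, in order."""
    words, target = tokens(job), tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def match_wildcard(job, pattern):
    """? stands for exactly one character, * for any number of characters."""
    return any(fnmatchcase(w, pattern.lower()) for w in tokens(job))


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def match_fuzzy(job, term, max_edits=2):
    """'roam~': accept words within max_edits of the term (assumed limit)."""
    return any(edit_distance(w, term.lower()) <= max_edits for w in tokens(job))
```

With the job titles “Senior Project Manager” and “Project Engineer”, match_any with the terms project and manager accepts both, while match_all and match_phrase accept only the first and match_not accepts only the second.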
 

Search Factors

Stemming: Stemming refers to the process of identifying the stem or root of the keyword search term and searching all variations of the word. Example: the search term ‘Fishing’ is used to perform a search.
The stem of the word ‘Fishing’ is ‘Fish’. Therefore, the search will include results for not only ‘Fishing’ but also ‘Fish’ and other variations of the word ‘Fish’ such as ‘Fished’ and ‘Fisher’. The stemming process is automatically applied to every English keyword search. However, this process does not apply to other languages.
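
The idea behind stemming can be shown with a deliberately naive suffix stripper. The suffix list and the toy_stem function are hypothetical and far simpler than a production stemmer; the actual stemming algorithm used by RMK is not documented here.

```python
# Suffixes removed by this toy stemmer, checked in order (an assumption).
SUFFIXES = ("ing", "ed", "er", "s")


def toy_stem(word):
    """Return a crude stem by removing one known suffix, if present."""
    w = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def same_stem(query_term, job_word):
    """Two words match when they reduce to the same stem."""
    return toy_stem(query_term) == toy_stem(job_word)
```

In this sketch ‘Fishing’, ‘Fished’, ‘Fisher’ and ‘Fish’ all reduce to the stem ‘fish’, so a search for any one of them matches jobs containing the others.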

Stop words: These are words that are excluded when used in a search string. True stop words are not recognized in rules or queries and are ignored. Many of the words listed below are not ignored outright but will still not produce the expected results. This process is designed to increase search accuracy. Such words are present throughout all jobs in large quantities and are removed from the search string to avoid skewing a job’s relevancy. This process is built into the framework and cannot be circumvented.

a     but   got   is    should  they
also  by    had   it    so      this
an    can   has   it's  such    to
and   did   have  its   than    too
any   do    how   like  that    want
am    does  i     no    the     was
are   find  i'm   not   their   were
as    for   if    of    then    what
at    from  in    on    there   when
be    get   into  or    these   where

The Case of "IT": "it" is one of the stop words. This means that IT (standing for Information Technology) cannot be used as a search term, as it will not return any results.
As for removing "IT" from the list, the risk is that every job containing “it” would then match the search request, which would render the search useless. Even an upper-case IT search would return unwanted results, as it would show all jobs in Italy. In any case, the search could not be made case sensitive, as this would be too restrictive.
What is more, changes to field data types and how they work have deep impacts on other searches for categories, rules, etc. that previously worked, so this change would simply be too risky.
Our recommendation is that "IT" jobs should always be spelled out completely, using Information Technology instead.
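
The effect of stop words on a query, including the "IT" case, can be sketched as follows. This is a simplified illustration, not the RMK code: the strip_stop_words function is hypothetical, it treats every word in the list above as dropped, it compares words case-insensitively, and it does not handle the Boolean operators AND, OR and NOT.

```python
# The stop words from the list above, one table row per line.
STOP_WORDS = set("""
a but got is should they
also by had it so this
an can has it's such to
and did have its than too
any do how like that want
am does i no the was
are find i'm not their were
as for if of then what
at from in on there when
be get into or these where
""".split())


def strip_stop_words(query):
    """Return the query terms that remain after stop words are removed."""
    return [word for word in query.split() if word.lower() not in STOP_WORDS]
```

Under these assumptions, strip_stop_words("IT Manager") leaves only Manager, and a search for IT alone leaves no terms at all, whereas "Information Technology Manager" keeps all three terms.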

Scoring: Scoring is the process of determining a job’s relevancy. When a keyword search is performed, every job in the database is scored based on its relevance to the keyword search term(s). The higher a job’s score, the more relevant it is deemed to be and the higher it will appear on the results list.

When a search is performed without any search parameters, the results are sorted by date.
When parameters are selected, the system uses the relevancy algorithm (scoring).

It is not possible to generate a report with all the keywords used by users in RMK site search.

Using custom JavaScript may impact the search results in various ways (missing jobs, no results, blank page).

Keywords

search syntax, how to search in RMK, IT jobs, Information Technology, RMK search logic, keyword, search box

KBA, LOD-SF-RMK-COR, RMK Core Platform, How To

Product

SAP SuccessFactors Recruiting all versions