Introduce lookup_name_info and generalize Ada's FULL/WILD name matching
author Pedro Alves <palves@redhat.com>
Wed, 8 Nov 2017 14:22:32 +0000 (14:22 +0000)
committer Pedro Alves <palves@redhat.com>
Wed, 8 Nov 2017 16:02:24 +0000 (16:02 +0000)
commit b5ec771e60c1a0863e51eb491c85c674097e9e13
tree e440aff8ba47188b8e144be1374487b8b110204f
parent 5ffa0793690b42b2a0c1c21dbb5e64634e58fa00

Summary:
 - This is preparation for supporting wild name matching on C++ too.
 - This is also preparation for TAB-completion fixes.
 - Makes symbol name matching (think strcmp_iw) be based on a per-language method.
 - Merges completion and non-completion name comparison (think
   language_ops::la_get_symbol_name_cmp generalized).
 - Avoids re-hashing the lookup name multiple times.
 - Centralizes preparing a name for lookup (Ada name encoding / C++ demangling),
   for both completion and non-completion.
 - Fixes a latent Ada bug with verbatim name matches in expressions.
 - Makes ada-lang.c use the common symtab.c completion code a bit more.

Ada's wild matching basically means that

 "(gdb) break foo"

will find all methods named "foo" in all packages.  Translating to
C++, it's roughly the same as saying that "break klass::method" sets
breakpoints on all "klass::method" methods of all classes, no matter
the namespace.  A following patch will teach GDB about fullname vs
wild matching for C++ too.  This patch is preparatory work to get
there.
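
To make the full vs wild distinction concrete, here is a minimal sketch
for C++-style qualified names.  The helpers matches_full and
matches_wild are hypothetical names made up for illustration; this is
not the matcher GDB implements, only one plausible reading of "match
in any enclosing scope".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: full matching requires the whole qualified
// name to be identical to the lookup name.
bool matches_full(const std::string &symbol, const std::string &lookup)
{
  return symbol == lookup;
}

// Hypothetical helper: wild matching accepts the lookup name as the
// trailing components of the symbol, i.e. the match must start either
// at the beginning of the symbol or right after a "::" separator.
bool matches_wild(const std::string &symbol, const std::string &lookup)
{
  if (lookup.empty() || lookup.size() > symbol.size())
    return false;

  std::size_t pos = symbol.size() - lookup.size();
  if (symbol.compare(pos, lookup.size(), lookup) != 0)
    return false;

  return pos == 0 || (pos >= 2 && symbol.compare(pos - 2, 2, "::") == 0);
}
```

Under these assumptions, "klass::method" wild-matches
"ns::klass::method" but not "ns::myklass::method", while a full match
only accepts the exact qualified name.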

Another idea here is to do symbol name matching based on the symbol
language's algorithm.  I.e., avoid depending on the currently set
language.

This allows for example doing

  (gdb) b foo::bar< int > (<tab>

and having gdb name match the C++ symbols correctly even if the
current language is C or Assembly (or Rust, or Ada, or ...), which can
easily happen if you step into an Assembly/C runtime library frame.
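
A rough sketch of that idea follows: the matcher is picked from the
symbol's own language, not from any global "current language".  All
names here (symbol_sketch, matcher_for, the two matchers) are
hypothetical, and the whitespace handling is a deliberate
simplification of what strncmp_iw does.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Illustrative language tags only.
enum class lang { c, cplus };

using matcher_fn = bool (*)(const std::string &symbol_name,
                            const std::string &lookup);

// Remove all whitespace (a simplification; the real rules are subtler).
static std::string strip_spaces(const std::string &s)
{
  std::string out;
  for (unsigned char c : s)
    if (!std::isspace(c))
      out += static_cast<char>(c);
  return out;
}

// Stand-in C matcher: LOOKUP must be a literal prefix of the symbol.
bool literal_prefix_match(const std::string &symbol_name,
                          const std::string &lookup)
{
  return symbol_name.compare(0, lookup.size(), lookup) == 0;
}

// Stand-in C++ matcher: prefix match that ignores whitespace.
bool spaceless_prefix_match(const std::string &symbol_name,
                            const std::string &lookup)
{
  std::string a = strip_spaces(symbol_name);
  std::string b = strip_spaces(lookup);
  return a.compare(0, b.size(), b) == 0;
}

// Select the matcher for a given language.
matcher_fn matcher_for(lang language)
{
  return language == lang::cplus ? spaceless_prefix_match
                                 : literal_prefix_match;
}

struct symbol_sketch
{
  std::string name;
  lang language;
};

// The symbol's language decides how LOOKUP is compared against it.
bool symbol_matches(const symbol_sketch &sym, const std::string &lookup)
{
  return matcher_for(sym.language)(sym.name, lookup);
}
```

With this shape, completing "foo::bar< int > (" finds a C++ symbol
named "foo::bar<int>(char)" regardless of which frame's language is
current.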

By encapsulating all the information related to a lookup name in a
class, we can also cache hash computation for a given language in the
lookup name object, to avoid recomputing it over and over.

Similarly, because we don't really know upfront which languages the
lookup name will be matched against, for each language we store the
lookup name transformed into a search name.  E.g., for C++, that means
demangling the name.  But for Ada, it means encoding the name.  This
actually forces us to centralize all the different lookup name
encoding in a central place, resulting in clearer code, IMO.  See
e.g., the new ada_lookup_name_info class.

The lookup name -> symbol search name computation is also done only
once per language.
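
The per-language caching described above could look roughly like the
following sketch.  The class name lookup_name_sketch, the
per_language_info struct and the transformations are all assumptions
made for illustration: stripping whitespace stands in for C++
demangling and lowercasing stands in for Ada encoding.  The real
lookup_name_info differs in its details.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Illustrative language tags only.
enum class lang { cplus, ada };

// Hypothetical per-language data derived from one user-supplied name.
struct per_language_info
{
  std::string search_name;  // the name transformed for this language
  std::size_t hash;         // hash of search_name, computed once
};

class lookup_name_sketch
{
public:
  explicit lookup_name_sketch(std::string name) : m_name(std::move(name)) {}

  // Return the cached data for LANGUAGE, computing it on first use.
  const per_language_info &info_for(lang language)
  {
    auto it = m_cache.find(language);
    if (it == m_cache.end())
      {
        ++m_computations;
        std::string search = transform(language);
        std::size_t h = std::hash<std::string>{}(search);
        it = m_cache.emplace(language, per_language_info{search, h}).first;
      }
    return it->second;
  }

  // Number of times a search name and hash were actually computed.
  int computations() const { return m_computations; }

private:
  // Stand-in transformations, not the real demangling or encoding.
  std::string transform(lang language) const
  {
    std::string out;
    for (unsigned char c : m_name)
      {
        if (language == lang::ada)
          out += static_cast<char>(std::tolower(c));
        else if (!std::isspace(c))
          out += static_cast<char>(c);
      }
    return out;
  }

  std::string m_name;
  std::map<lang, per_language_info> m_cache;
  int m_computations = 0;
};
```

Asking the same object for the C++ data twice and the Ada data once
performs only two computations, which is the point of keeping the
cache inside the lookup name object.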

The old language->la_get_symbol_name_cmp / symbol_name_cmp_ftype are
generalized to work with both completion and normal symbol lookup.

At some point early on, I had separate completion vs non-completion
language vector entry points, but a single method ends up being better
IMO for simplifying things -- the more we merge the completion /
non-completion name lookup code paths, the fewer chances for bugs
causing completion vs normal lookup finding different symbols.

The ada-lex.l change is necessary because when doing

  (gdb) p <UpperCase>

then the name that is passed to write_var_or_type ->
ada_lookup_symbol_list misses the "<>", i.e., it's just "UpperCase",
and we end up doing a wild match against "UpperCase" lowercased by
ada_lookup_name_info's constructor.  I.e., "uppercase" wouldn't ever
match "UpperCase", and the symbol lookup fails.

This wouldn't cause any regression in the testsuite, but I added a new
test that would pass before the patch and fail after, if it weren't
for that fix.

This is a latent bug that happens to go unnoticed because that
particular path was inconsistent with the rest of Ada symbol lookup by
not lowercasing the lookup name.
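
The verbatim vs folded distinction can be sketched as below.  The
names ada_name_sketch and prepare_ada_name are hypothetical, and the
real ada_lookup_name_info constructor also encodes the name, which
this sketch leaves out.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical result of preparing an Ada lookup name.
struct ada_name_sketch
{
  std::string name;  // the name to search for
  bool verbatim;     // true if the user wrote <name>
};

// A name wrapped in "<>" is taken verbatim, preserving its case.
// Anything else is lowercased, since Ada is case-insensitive.
ada_name_sketch prepare_ada_name(const std::string &user_name)
{
  if (user_name.size() >= 2
      && user_name.front() == '<'
      && user_name.back() == '>')
    return {user_name.substr(1, user_name.size() - 2), true};

  std::string folded;
  for (unsigned char c : user_name)
    folded += static_cast<char>(std::tolower(c));
  return {folded, false};
}
```

If the lexer drops the "<>" before this step, "<UpperCase>" arrives as
"UpperCase", gets folded to "uppercase", and can no longer match the
symbol, which is the failure described above.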

Ada's symbol_completion_add is deleted, replaced by using common
code's completion_list_add_name.  To make the latter work for Ada, we
needed to add a new output parameter, because Ada wants to return
custom completion candidates that are not the symbol name.

With this patch, minimal symbol demangled name hashing is made
consistent with regular symbol hashing.  I.e., it now goes via the
language vector's search_name_hash method too, as I had suggested in a
previous patch.

dw2_expand_symtabs_matching / .gdb_index symbol names were a
challenge.  The problem is that we have no way of telling the
language of each symbol name found in the index until we expand the
corresponding full symbol, which is of course what we're trying to
avoid.  Language information is simply not considered in the index
format...  Since the symbol name hashing and comparison routines are
per-language, we now have a problem.  The patch sorts this out by
matching each name against all languages.  This is inefficient, and
indeed makes completion several times slower.  E.g., with:

 $ cat script.cmd
 set pagination off
 set $count = 0
 while $count < 400
   complete b string_prin
   printf "count = %d\n", $count
   set $count = $count + 1
 end

 $ time gdb --batch -q ./gdb-with-index -ex "source script.cmd"

I get, before patch (-O2, x86-64):

 real    0m1.773s
 user    0m1.737s
 sys     0m0.040s

While after patch (-O2, x86-64):

 real    0m9.843s
 user    0m9.482s
 sys     0m0.034s

However, the following patch will optimize this, and will actually
make this use case faster compared to the "before patch" above:

 real    0m1.321s
 user    0m1.285s
 sys     0m0.039s
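
For reference, the "match each index name against all languages"
workaround described above has roughly the following shape.  This is
only a sketch: matcher_fn, the two sample matchers and
matches_any_language are hypothetical names, and the real
gdb_index_symbol_name_matcher is organized differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <functional>
#include <string>
#include <vector>

using matcher_fn
  = std::function<bool(const std::string &index_name,
                       const std::string &lookup)>;

// Stand-in for a case-sensitive language matcher.
bool exact_match(const std::string &index_name, const std::string &lookup)
{
  return index_name == lookup;
}

// Stand-in for a case-insensitive language matcher.
bool caseless_match(const std::string &index_name, const std::string &lookup)
{
  if (index_name.size() != lookup.size())
    return false;
  return std::equal(index_name.begin(), index_name.end(), lookup.begin(),
                    [](unsigned char a, unsigned char b)
                    { return std::tolower(a) == std::tolower(b); });
}

// The index does not say which language a name belongs to, so try
// every language's matcher and accept the name if any of them does.
bool matches_any_language(const std::string &index_name,
                          const std::string &lookup,
                          const std::vector<matcher_fn> &matchers)
{
  return std::any_of(matchers.begin(), matchers.end(),
                     [&](const matcher_fn &m)
                     { return m(index_name, lookup); });
}
```

Each index name costs one matcher call per language in the worst
case, which is consistent with the slowdown measured above.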

gdb/ChangeLog:
2017-11-08   Pedro Alves  <palves@redhat.com>

* ada-lang.c (ada_encode): Rename to ...
(ada_encode_1): ... this.  Add throw_errors parameter and handle
it.
(ada_encode): Reimplement.
(match_name): Delete, folded into full_match.
(resolve_subexp): No longer pass the encoded name to
ada_lookup_symbol_list.
(should_use_wild_match): Delete.
(name_match_type_from_name): New.
(ada_lookup_simple_minsym): Use lookup_name_info and the
language's symbol_name_matcher_ftype.
(add_symbols_from_enclosing_procs, ada_add_local_symbols)
(ada_add_block_renamings): Adjust to use lookup_name_info.
(ada_lookup_name): New.
(add_nonlocal_symbols, ada_add_all_symbols)
(ada_lookup_symbol_list_worker, ada_lookup_symbol_list)
(ada_iterate_over_symbols): Adjust to use lookup_name_info.
(ada_name_for_lookup): Delete.
(ada_lookup_encoded_symbol): Construct a verbatim name.
(wild_match): Reverse sense of return type.  Use bool.
(full_match): Reverse sense of return type.  Inline bits of old
match_name here.
(ada_add_block_symbols): Adjust to use lookup_name_info.
(symbol_completion_match): Delete, folded into...
(ada_lookup_name_info::matches): ... this new method.
(symbol_completion_add): Delete.
(ada_collect_symbol_completion_matches): Add name_match_type
parameter.  Adjust to use lookup_name_info and
completion_list_add_name.
(get_var_value, ada_add_global_exceptions): Adjust to use
lookup_name_info.
(ada_get_symbol_name_cmp): Delete.
(do_wild_match, do_full_match): New functions.
(ada_lookup_name_info::ada_lookup_name_info): New method.
(ada_symbol_name_matches, ada_get_symbol_name_matcher): New
functions.
(ada_language_defn): Install ada_get_symbol_name_matcher.
* ada-lex.l (processId): If name starts with '<', copy it
verbatim.
* block.c (block_iter_match_step, block_iter_match_first)
(block_iter_match_next, block_lookup_symbol)
(block_lookup_symbol_primary, block_find_symbol): Adjust to use
lookup_name_info.
* block.h (block_iter_match_first, block_iter_match_next)
(ALL_BLOCK_SYMBOLS_WITH_NAME): Adjust to use lookup_name_info.
* c-lang.c (c_language_defn, cplus_language_defn)
(asm_language_defn, minimal_language_defn): Adjust comments to
refer to la_get_symbol_name_matcher.
* completer.c (complete_files_symbols)
(collect_explicit_location_matches, symbol_completer): Pass a
symbol_name_match_type down.
* completer.h (class completion_match, completion_match_result):
New classes.
(completion_tracker::reset_completion_match_result): New method.
(completion_tracker::m_completion_match_result): New field.
* cp-support.c (make_symbol_overload_list_block): Adjust to use
lookup_name_info.
(cp_fq_symbol_name_matches, cp_get_symbol_name_matcher): New
functions.
* cp-support.h (cp_get_symbol_name_matcher): New declaration.
* d-lang.c: Adjust comments to refer to
la_get_symbol_name_matcher.
* dictionary.c (dict_vector) <iter_match_first, iter_match_next>:
Adjust to use lookup_name_info.
(dict_iter_match_first, dict_iter_match_next)
(iter_match_first_hashed, iter_match_next_hashed)
(iter_match_first_linear, iter_match_next_linear): Adjust to work
with a lookup_name_info.
* dictionary.h (dict_iter_match_first, dict_iter_match_next):
Likewise.
* dwarf2read.c (dw2_lookup_symbol): Adjust to use lookup_name_info.
(dw2_map_matching_symbols): Adjust to use symbol_name_match_type.
(gdb_index_symbol_name_matcher): New class.
(dw2_expand_symtabs_matching): Adjust to use lookup_name_info and
gdb_index_symbol_name_matcher.  Accept a NULL symbol_matcher.
* f-lang.c (f_collect_symbol_completion_matches): Adjust to work
with a symbol_name_match_type.
(f_language_defn): Adjust comments to refer to
la_get_symbol_name_matcher.
* go-lang.c (go_language_defn): Adjust comments to refer to
la_get_symbol_name_matcher.
* language.c (default_symbol_name_matcher)
(language_get_symbol_name_matcher): New functions.
(unknown_language_defn, auto_language_defn): Adjust comments to
refer to la_get_symbol_name_matcher.
* language.h (symbol_name_cmp_ftype): Delete.
(language_defn) <la_collect_symbol_completion_matches>: Add match
type parameter.
<la_get_symbol_name_cmp>: Delete field.
<la_get_symbol_name_matcher>: New field.
<la_iterate_over_symbols>: Adjust to use lookup_name_info.
(default_symbol_name_matcher, language_get_symbol_name_matcher):
Declare.
* linespec.c (iterate_over_all_matching_symtabs)
(iterate_over_file_blocks): Adjust to use lookup_name_info.
(find_methods): Add language parameter, and use lookup_name_info
and the language's symbol_name_matcher_ftype.
(linespec_complete_function): Adjust.
(lookup_prefix_sym): Use lookup_name_info.
(add_all_symbol_names_from_pspace): Adjust.
(find_superclass_methods): Add language parameter and pass it
down.
(find_method): Pass symbol language down.
(find_linespec_symbols): Don't demangle or Ada encode here.
(search_minsyms_for_name): Add lookup_name_info parameter.
(add_matching_symbols_to_info): Add name_match_type parameter.
Use lookup_name_info.
* m2-lang.c (m2_language_defn): Adjust comments to refer to
la_get_symbol_name_matcher.
* minsyms.c: Include <algorithm>.
(add_minsym_to_demangled_hash_table): Remove table parameter and
add objfile parameter.  Use search_name_hash, and add language to
demangled languages vector.
(struct found_minimal_symbols): New struct.
(lookup_minimal_symbol_mangled, lookup_minimal_symbol_demangled):
New functions.
(lookup_minimal_symbol): Adjust to use them.  Don't canonicalize
input names here.  Use lookup_name_info instead.  Look up
demangled names once for each language in the demangled names
vector.
(iterate_over_minimal_symbols): Use lookup_name_info.  Look up
demangled names once for each language in the demangled names
vector.
(build_minimal_symbol_hash_tables): Adjust.
* minsyms.h (iterate_over_minimal_symbols): Adjust to pass down a
lookup_name_info.
* objc-lang.c (objc_language_defn): Adjust comment to refer to
la_get_symbol_name_matcher.
* objfiles.h: Include <vector>.
(objfile_per_bfd_storage) <demangled_hash_languages>: New field.
* opencl-lang.c (opencl_language_defn): Adjust comment to refer to
la_get_symbol_name_matcher.
* p-lang.c (pascal_language_defn): Adjust comment to refer to
la_get_symbol_name_matcher.
* psymtab.c (psym_lookup_symbol): Use lookup_name_info.
(match_partial_symbol): Use symbol_name_match_type,
lookup_name_info and psymbol_name_matches.
(lookup_partial_symbol): Use lookup_name_info.
(map_block): Use symbol_name_match_type and lookup_name_info.
(psym_map_matching_symbols): Use symbol_name_match_type.
(psymbol_name_matches): New.
(recursively_search_psymtabs): Use lookup_name_info and
psymbol_name_matches.  Rename 'kind' parameter to 'domain'.
(psym_expand_symtabs_matching): Use lookup_name_info.  Rename
'kind' parameter to 'domain'.
* rust-lang.c (rust_language_defn): Adjust comment to refer to
la_get_symbol_name_matcher.
* symfile-debug.c (debug_qf_map_matching_symbols): Use
symbol_name_match_type.
(debug_qf_expand_symtabs_matching): Use lookup_name_info.
* symfile.c (expand_symtabs_matching): Use lookup_name_info.
* symfile.h (quick_symbol_functions) <map_matching_symbols>:
Adjust to use symbol_name_match_type.
<expand_symtabs_matching>: Adjust to use lookup_name_info.
(expand_symtabs_matching): Adjust to use lookup_name_info.
* symmisc.c (maintenance_expand_symtabs): Use
lookup_name_info::match_any ().
* symtab.c (symbol_matches_search_name): New.
(eq_symbol_entry): Adjust to use lookup_name_info and the
language's matcher.
(demangle_for_lookup_info::demangle_for_lookup_info): New.
(lookup_name_info::match_any): New.
(iterate_over_symbols, search_symbols): Use lookup_name_info.
(compare_symbol_name): Add language, lookup_name_info and
completion_match_result parameters, and use them.
(completion_list_add_name): Make extern.  Add language and
lookup_name_info parameters.  Use them.
(completion_list_add_symbol, completion_list_add_msymbol)
(completion_list_objc_symbol): Add lookup_name_info parameters and
adjust.  Pass down language.
(completion_list_add_fields): Add lookup_name_info parameters and
adjust.  Pass down language.
(add_symtab_completions): Add lookup_name_info parameters and
adjust.
(default_collect_symbol_completion_matches_break_on): Add
name_match_type parameter, and use it.  Use lookup_name_info.
(default_collect_symbol_completion_matches)
(collect_symbol_completion_matches): Add name_match_type
parameter, and pass it down.
(collect_symbol_completion_matches_type): Adjust.
(collect_file_symbol_completion_matches): Add name_match_type
parameter, and use lookup_name_info.
* symtab.h: Include <string> and "common/gdb_optional.h".
(enum class symbol_name_match_type): New.
(class ada_lookup_name_info): New.
(struct demangle_for_lookup_info): New.
(class lookup_name_info): New.
(symbol_name_matcher_ftype): New.
(SYMBOL_MATCHES_SEARCH_NAME): Use symbol_matches_search_name.
(symbol_matches_search_name): Declare.
(MSYMBOL_MATCHES_SEARCH_NAME): Delete.
(default_collect_symbol_completion_matches)
(collect_symbol_completion_matches)
(collect_file_symbol_completion_matches): Add name_match_type
parameter.
(iterate_over_symbols): Use lookup_name_info.
(completion_list_add_name): Declare.
* utils.c (enum class strncmp_iw_mode): Moved to utils.h.
(strncmp_iw_with_mode): Now extern.
* utils.h (enum class strncmp_iw_mode): Moved from utils.c.
(strncmp_iw_with_mode): Declare.

gdb/testsuite/ChangeLog:
2017-11-08   Pedro Alves  <palves@redhat.com>

* gdb.ada/complete.exp (p <Exported_Capitalized>): New test.
(p Exported_Capitalized): New test.
(p exported_capitalized): New test.
38 files changed:
gdb/ChangeLog
gdb/ada-lang.c
gdb/ada-lex.l
gdb/block.c
gdb/block.h
gdb/c-lang.c
gdb/completer.c
gdb/completer.h
gdb/cp-support.c
gdb/cp-support.h
gdb/d-lang.c
gdb/dictionary.c
gdb/dictionary.h
gdb/dwarf2read.c
gdb/f-lang.c
gdb/go-lang.c
gdb/language.c
gdb/language.h
gdb/linespec.c
gdb/m2-lang.c
gdb/minsyms.c
gdb/minsyms.h
gdb/objc-lang.c
gdb/objfiles.h
gdb/opencl-lang.c
gdb/p-lang.c
gdb/psymtab.c
gdb/rust-lang.c
gdb/symfile-debug.c
gdb/symfile.c
gdb/symfile.h
gdb/symmisc.c
gdb/symtab.c
gdb/symtab.h
gdb/testsuite/ChangeLog
gdb/testsuite/gdb.ada/complete.exp
gdb/utils.c
gdb/utils.h