aboutsummaryrefslogtreecommitdiff
path: root/src/pch
diff options
context:
space:
mode:
authorJade Lovelace <lix@jade.fyi>2024-08-23 16:33:48 -0700
committerRebecca Turner <rbt@sent.as>2024-08-28 09:55:05 -0700
commit04f8a1483362212f33f97c13472ded4121a18a86 (patch)
tree58b6644aa78be920b93a23244ea8663c4578256a /src/pch
parente6f2af06e6d7cc01e142c21259635094f73de104 (diff)
tree-wide: shuffle headers around for about 30s compile time
This didn't really feel so worth it afterwards, but I did untangle a bunch of stuff that should not have been tangled. The general gist of this change is that variant bullshit was causing a bunch of compile time, and it seems like the only way to deal with variant induced compile time is to keep variant types out of headers. Explicit template instantiation seems to do nothing for them. I also seem to have gotten some back-end time improvement from explicitly instantiating regex, but I don't know why. There is no corresponding front-end time improvement from it: regex is still at the top of the sinners list. **** Templates that took longest to instantiate: 15231 ms: std::basic_regex<char>::_M_compile (28 times, avg 543 ms) 15066 ms: std::__detail::_Compiler<std::regex_traits<char>>::_Compiler (28 times, avg 538 ms) 12571 ms: std::__detail::_Compiler<std::regex_traits<char>>::_M_disjunction (28 times, avg 448 ms) 12454 ms: std::__detail::_Compiler<std::regex_traits<char>>::_M_alternative (28 times, avg 444 ms) 12225 ms: std::__detail::_Compiler<std::regex_traits<char>>::_M_term (28 times, avg 436 ms) 11363 ms: nlohmann::basic_json<>::parse<const char *> (21 times, avg 541 ms) 10628 ms: nlohmann::basic_json<>::basic_json (109 times, avg 97 ms) 10134 ms: std::__detail::_Compiler<std::regex_traits<char>>::_M_atom (28 times, avg 361 ms) Back-end time before messing with the regex: **** Function sets that took longest to compile / optimize: 8076 ms: void boost::io::detail::put<$>(boost::io::detail::put_holder<$> cons... (177 times, avg 45 ms) 4382 ms: std::_Rb_tree<$>::_M_erase(std::_Rb_tree_node<$>*) (1247 times, avg 3 ms) 3137 ms: boost::stacktrace::detail::to_string_impl_base<boost::stacktrace::de... (137 times, avg 22 ms) 2896 ms: void boost::io::detail::mk_str<$>(std::__cxx11::basic_string<$>&, ch... (177 times, avg 16 ms) 2304 ms: std::_Rb_tree<$>::_M_get_insert_hint_unique_pos(std::_Rb_tree_const_... (210 times, avg 10 ms) 2116 ms: bool std::__detail::_Compiler<$>::_M_expression_term<$>(std::__detai... (112 times, avg 18 ms) 2051 ms: std::_Rb_tree_iterator<$> std::_Rb_tree<$>::_M_emplace_hint_unique<$... (244 times, avg 8 ms) 2037 ms: toml::result<$> toml::detail::sequence<$>::invoke<$>(toml::detail::l... (93 times, avg 21 ms) 1928 ms: std::__detail::_Compiler<$>::_M_quantifier() (28 times, avg 68 ms) 1859 ms: nlohmann::json_abi_v3_11_3::detail::serializer<$>::dump(nlohmann::js... (41 times, avg 45 ms) 1824 ms: std::_Function_handler<$>::_M_manager(std::_Any_data&, std::_Any_dat... (973 times, avg 1 ms) 1810 ms: std::__detail::_BracketMatcher<$>::_BracketMatcher(std::__detail::_B... (112 times, avg 16 ms) 1793 ms: nix::fetchers::GitInputScheme::fetch(nix::ref<$>, nix::fetchers::Inp... (1 times, avg 1793 ms) 1759 ms: std::_Rb_tree<$>::_M_get_insert_unique_pos(std::__cxx11::basic_strin... (281 times, avg 6 ms) 1722 ms: bool nlohmann::json_abi_v3_11_3::detail::parser<$>::sax_parse_intern... (19 times, avg 90 ms) 1677 ms: boost::io::basic_altstringbuf<$>::overflow(int) (194 times, avg 8 ms) 1674 ms: std::__cxx11::basic_string<$>::_M_mutate(unsigned long, unsigned lon... (249 times, avg 6 ms) 1660 ms: std::_Rb_tree_node<$>* std::_Rb_tree<$>::_M_copy<$>(std::_Rb_tree_no... (304 times, avg 5 ms) 1599 ms: bool nlohmann::json_abi_v3_11_3::detail::parser<$>::sax_parse_intern... (19 times, avg 84 ms) 1568 ms: void std::__detail::_Compiler<$>::_M_insert_bracket_matcher<$>(bool) (112 times, avg 14 ms) 1541 ms: std::__shared_ptr<$>::~__shared_ptr() (531 times, avg 2 ms) 1539 ms: nlohmann::json_abi_v3_11_3::detail::serializer<$>::dump_escaped(std:... (41 times, avg 37 ms) 1471 ms: void std::__detail::_Compiler<$>::_M_insert_character_class_matcher<... (112 times, avg 13 ms) After messing with the regex (notice std::__detail::_Compiler vanishes here, but I don't know why): **** Function sets that took longest to compile / optimize: 8054 ms: void boost::io::detail::put<$>(boost::io::detail::put_holder<$> cons... (177 times, avg 45 ms) 4313 ms: std::_Rb_tree<$>::_M_erase(std::_Rb_tree_node<$>*) (1217 times, avg 3 ms) 3259 ms: boost::stacktrace::detail::to_string_impl_base<boost::stacktrace::de... (137 times, avg 23 ms) 3045 ms: void boost::io::detail::mk_str<$>(std::__cxx11::basic_string<$>&, ch... (177 times, avg 17 ms) 2314 ms: std::_Rb_tree<$>::_M_get_insert_hint_unique_pos(std::_Rb_tree_const_... (207 times, avg 11 ms) 1923 ms: std::_Rb_tree_iterator<$> std::_Rb_tree<$>::_M_emplace_hint_unique<$... (216 times, avg 8 ms) 1817 ms: bool nlohmann::json_abi_v3_11_3::detail::parser<$>::sax_parse_intern... (18 times, avg 100 ms) 1816 ms: toml::result<$> toml::detail::sequence<$>::invoke<$>(toml::detail::l... (93 times, avg 19 ms) 1788 ms: nlohmann::json_abi_v3_11_3::detail::serializer<$>::dump(nlohmann::js... (40 times, avg 44 ms) 1749 ms: std::_Rb_tree<$>::_M_get_insert_unique_pos(std::__cxx11::basic_strin... (278 times, avg 6 ms) 1724 ms: std::__cxx11::basic_string<$>::_M_mutate(unsigned long, unsigned lon... (248 times, avg 6 ms) 1697 ms: boost::io::basic_altstringbuf<$>::overflow(int) (194 times, avg 8 ms) 1684 ms: nix::fetchers::GitInputScheme::fetch(nix::ref<$>, nix::fetchers::Inp... (1 times, avg 1684 ms) 1680 ms: std::_Rb_tree_node<$>* std::_Rb_tree<$>::_M_copy<$>(std::_Rb_tree_no... (303 times, avg 5 ms) 1589 ms: bool nlohmann::json_abi_v3_11_3::detail::parser<$>::sax_parse_intern... (18 times, avg 88 ms) 1483 ms: non-virtual thunk to boost::wrapexcept<$>::~wrapexcept() (181 times, avg 8 ms) 1447 ms: nlohmann::json_abi_v3_11_3::detail::serializer<$>::dump_escaped(std:... (40 times, avg 36 ms) 1441 ms: std::__shared_ptr<$>::~__shared_ptr() (496 times, avg 2 ms) 1420 ms: boost::stacktrace::basic_stacktrace<$>::init(unsigned long, unsigned... (137 times, avg 10 ms) 1396 ms: boost::basic_format<$>::~basic_format() (194 times, avg 7 ms) 1290 ms: std::__cxx11::basic_string<$>::_M_replace_cold(char*, unsigned long,... (231 times, avg 5 ms) 1258 ms: std::vector<$>::~vector() (354 times, avg 3 ms) 1222 ms: std::__cxx11::basic_string<$>::_M_replace(unsigned long, unsigned lo... (231 times, avg 5 ms) 1194 ms: std::_Rb_tree<$>::_M_get_insert_hint_unique_pos(std::_Rb_tree_const_... (49 times, avg 24 ms) 1186 ms: bool tao::pegtl::internal::sor<$>::match<$>(std::integer_sequence<$>... (1 times, avg 1186 ms) 1149 ms: std::__detail::_Executor<$>::_M_dfs(std::__detail::_Executor<$>::_Ma... (70 times, avg 16 ms) 1123 ms: toml::detail::sequence<$>::invoke(toml::detail::location&) (69 times, avg 16 ms) 1110 ms: nlohmann::json_abi_v3_11_3::basic_json<$>::json_value::destroy(nlohm... (55 times, avg 20 ms) 1079 ms: std::_Function_handler<$>::_M_manager(std::_Any_data&, std::_Any_dat... (541 times, avg 1 ms) 1033 ms: nlohmann::json_abi_v3_11_3::detail::lexer<$>::scan_number() (20 times, avg 51 ms) Change-Id: I10af282bcd4fc39c2d3caae3453e599e4639c70b
Diffstat (limited to 'src/pch')
-rw-r--r--src/pch/precompiled-headers.hh16
1 files changed, 16 insertions, 0 deletions
diff --git a/src/pch/precompiled-headers.hh b/src/pch/precompiled-headers.hh
index f52f1cab8..c417d27db 100644
--- a/src/pch/precompiled-headers.hh
+++ b/src/pch/precompiled-headers.hh
@@ -58,3 +58,19 @@
#include <unistd.h>
#include <nlohmann/json.hpp>
+
+// This stuff is here to force the compiler to actually apply the extern
+// template directives in all compilation units. To borrow a term, under
+// complex microarchitectural conditions, clang ignores the extern template
+// declaration, as revealed in the profile.
+//
+// In most cases, extern template works fine in the header itself. We don't
+// have any idea why this happens.
+
+// Here because of all the regexes everywhere (it is infeasible to block instantiation everywhere)
+// For some reason this does not actually prevent the instantiation of
+// regex::_M_compile, and the regex compiler (my interpretation of what this is
+// supposed to do is make the template bits out-of-line), but it *does* prevent
+// a bunch of codegen of regex stuff, which seems to save about 30s on-cpu.
+// Instantiated in libutil/regex.cc.
+extern template class std::basic_regex<char>;