-
-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
GeoIP2 trys to write characters outside of ISO-8859 #127
Comments
Forgot about loading: Of course we also need to unescape the characters on loading/updating the data file. So e.g. in |
Here is a better approach without the need of decoding and HTML::Entities as extra dependency. We simply store the escaped name in the hash: --- b/wwwroot/cgi-bin/plugins/geoip2_city.pm.orig 2019-04-19 12:36:02.026552366 +0200
+++ a/wwwroot/cgi-bin/plugins/geoip2_city.pm 2019-04-19 12:35:52.906632209 +0200
@@ -29,6 +29,11 @@
#use strict;
no strict "refs";
+sub html_escape {
+ my $s = shift;
+ $s =~ s/([^\x00-\x7F])/sprintf "&#x%X;", ord($1)/ge;
+ return $s;
+}
#-----------------------------------------------------------------------------
@@ -369,7 +374,7 @@
my $countrycity=$record->country()->iso_code().'_'.$record->city()->name();
$countrycity=~s/ /%20/g;
if ($record->most_specific_subdivision()->name()) { $countrycity.='_'.$record->most_specific_subdivision()->name(); }
- $_city_h{lc($countrycity)}++;
+ $_city_h{html_escape(lc($countrycity))}++;
} else {
$_city_h{'unknown'}++;
}
@@ -409,7 +414,7 @@
my $countrycity=($record->country()->iso_code()).'_'.$city;
$countrycity=~s/ /%20/g;
if ($record->most_specific_subdivision()->name()) { $countrycity.='_'.$record->most_specific_subdivision()->name(); }
- $_city_h{lc($countrycity)}++;
+ $_city_h{html_escape(lc($countrycity))}++;
} else {
$_city_h{'unknown'}++;
} |
manuelm
added a commit
to manuelm/awstats
that referenced
this issue
Apr 26, 2019
* Correctly convert dns names to ip4 and ip6 address using getaddrinfo (fixes eldy#120, eldy#121, obsoletes eldy#115) * Only lookup if the IP is of type public (fixes eldy#122) and catch further lookup errors (obsoletes eldy#123) * Store and display the GeoIP City output HTML escaped (fixes eldy#127) * Code to perform and cache the actual lookup is consolidated * General code improvement and readability
# for free
to join this conversation on GitHub.
Already have an account?
# to comment
This is related to #69 but targets GeoIP2 modules. Mainly GeoIP2::City. GeoIP2 names are UTF-8 encoded. Therefor they can contain characters outside of the ISO-8859 range. Since AWStats data files are stored in ISO the GeoIP output has to be escaped. As in #69 I suggest html escaping them, either using
HTML::Entities
or a stripped down version which only escapes high bytes:Regarding
geoip2_city.pm
the function call should be added inSectionWriteHistory_geoip2_city
so thatXMLEncodeForHisto($_)
becomesXMLEncodeForHisto(html_escape($_))
(orXMLEncodeForHisto(encode_entities_numeric($_))
).For
geoip2.pm
this isn't strictly necessary as I'm not aware of any international countries names that contain characters outside the ISO range.195.38.12.0
is an ipaddress for testing. GeoIP2::City tries to storePL_Zamość_Lublin
. With escaping this gets converted toPL_Zamość_Lublin
.The text was updated successfully, but these errors were encountered: