This is the Unreliable/Predatory Source Detector (UPSD), a user script that identifies various unreliable and potentially unreliable sources. This is not a tool to be used mindlessly.
For example, Twitter is generally unreliable. If Twitter is used in an article, the script will only tell you that a generally unreliable source was used. It does not say that Twitter was used inappropriately, or that it shouldn't be used for that information. The script cannot tell the difference between a tweet by a random person and one by NASA. Questions, comments and requests can be made on the talk page. |
Description | Easily detects unreliable and potentially unreliable sourcing |
---|---|
Author(s) | creffett, Headbomb, Jorm, SD0001 |
Maintainer(s) | Headbomb |
Status | WP:TOPSCRIPTS #12 |
Updated | July 29, 2024 |
Source | User:Headbomb/unreliable.js |
importScript( 'User:Headbomb/unreliable.js' ); // Backlink: [[User:Headbomb/unreliable.js]]
to the page (you may need to create it), like this.
Once installed, you can go to User:Headbomb/unreliable/testcases to see if it works.
The script breaks down external links (including DOIs) to various sources in different 'severities' of unreliability. In general, the script is kept in sync with
and common-sense "duh" cases I come across (like a parody website), with some minor differences.
Severity | Appearance | Explanation |
---|---|---|
Blacklisted | example.com | The source is blacklisted on Wikipedia and can only be used with explicit permission. Due to the large number of blacklisted sites that have effectively been purged from Wikipedia, only those listed at WP:RSPSOURCES are highlighted. Only in extremely exceptional circumstances should those links be allowed to remain, typically only on articles about said source. For example, a link to Breitbart News is appropriate on the Breitbart News article and pretty much nowhere else. Note: Some blacklistings have a time component, like Lenta.ru (blacklisted from 2014 onwards). The script cannot tell if an article is from before or after the time of blacklisting, and so will highlight all cases. |
Deprecated/predatory | example.com | There is community consensus to deprecate the source. The source is considered generally unreliable, and use of the source is generally prohibited. This includes a slew of predatory publishers and journals, propaganda, fake news, and other terrible sources of information. The source should only be used in exceptional circumstances, similar to blacklisted sources, but these circumstances are not as rare. Note: Some deprecations have a time component, like journals from the formerly-reputable Pulsus Group, acquired by the predatory OMICS Publishing Group in 2016. The script cannot tell if an article is from before or after the time of acquisition, and so will highlight all cases. |
Generally unreliable | example.com | The source has a poor reputation for fact-checking, fails to correct errors, is self-published, is sponsored content, presents user-generated content, violates copyrights, or is otherwise of low quality. The source should generally be avoided, but context matters a lot here. Note: In the case of user-generated content and social media, first check who the user/account is (Randy in Boise vs. NASA's official account). |
Marginally reliable | example.com | Sources which may or may not be appropriate for Wikipedia. For instance, Forbes.com is generally reliable, but its contributors generally are not. This category will include preprints, general repositories which can host preprints and predatory journal articles (e.g. Academia.edu and ResearchGate), general book repositories which can include self-published books (e.g. Google Books and OCLC), as well as sources which may or may not fail WP:MEDRS (or WP:BLPSOURCES) but which may be acceptable for other types of claims. (This section is under development, and not all marginally reliable sources from WP:RSPSOURCES are currently detected.) Note: This is where using your brain matters the most, as these sources are generally the least problematic and may not even be problematic at all. This is mostly a "double-check this" reminder, rather than a "probably should be removed" warning. |
If you see a source that should be highlighted but isn't (or shouldn't be highlighted but is), first let me know on the talk page, along with the relevant website or DOI. But since I do not want my opinion to be king, I maintain a general policy that everything is appealable at WP:RSN, in case of mistakes, accidental misclassifications, etc.
<ref>Gawker... </ref>
followed by The New York Times on 15 September and several other outlets afterwards. However, it could also constitute a violation of WP:DUE, of WP:PRIMARY, of WP:BLPSOURCE, and many other policies and guidelines. Compare these two situations, using the same hypothetical source:
Secret CIA experiments like MKUltra successfully indoctrinated the French and Italian prime ministers in the 1970s. Evidence for this is secretly held in Area 51. It is suspected that the current heads of state of Lesotho, Canada, and Finland are currently under influence, but experts' opinions vary on whether or not the CIA truly controls them, or if the Reptilians or perhaps even Zeta Reticulans are using the CIA as a proxy. [1] |
|
Conspiracy theorists like John Smith often claim that CIA "mind control" experiments have indoctrinated heads of state. [1] |
|
<ref>Gawker... </ref><!-- See [[Talk:Article#Should we use Gawker?]]-->
The |arxiv=, |biorxiv=, |citeseerx=, and |ssrn= parameters of the various {{cite xxx}} templates are obviously not problematic for reliability, so long as the citation itself isn't problematic. Citations to preprints will often be acceptable for routine claims as self-published expert sources, but they will invariably fail WP:MEDRS and will not meet a higher standard of sourcing, as preprints are not peer-reviewed (or will reflect a state prior to peer review). Keep in mind that several papers hosted on preprint repositories will have been published in peer-reviewed venues (and some of those papers are even technically postprints), so you should always investigate rather than assume that something is unreliable simply because it's on a preprint server. You may simply need to update things to a proper {{cite journal}}, rather than a {{cite arxiv}} or {{cite biorxiv}}.
This section in a nutshell:
|
The script only operates on <ref>...</ref> tags.
That is, it will detect links to Deprecated.com, as well as references and list items that mention Deprecated.com, but it won't recognize other mentions of Deprecated.com in the text. In practice, this means that all URLs are checked (regardless of where they are), as well as all lists of publications/bibliographies/references that follow a regular format (including those in the Further reading/External links sections).
John Smith (2014). "[https://www.deprecated.com/article Article of things]". ''Deprecated.com''. Accessed 2020-02-14.
John Smith (2014). "Article of things". Available on ''Deprecated.com''. Accessed 2020-02-14.
The script can easily classify DOIs by their DOI prefixes, which correspond to various registrants (for instance, 10.4172/... belongs to the OMICS Publishing Group). Most registrants are publishers; some are individual journals.
The script can also classify DOIs through "starting patterns", but this is trickier. For example, Chaos, Solitons & Fractals has DOIs like doi:10.1016/j.chaos.2018.11.014 or doi:10.1016/S0960-0779(09)00060-5. These have starting patterns of 10.1016/j.chaos. and 10.1016/S0960-0779, which will not match other journals. However, such patterns are very tricky to determine, as they can vary over time and can be hard to recognize as meaningful (here S0960-0779 is related to the ISSN of that specific journal, and isn't just a random string like doi:10.1023/a:1022444005336). They could also be so closely related to the patterns of other journals as to cause a collision.
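The two matching strategies described above can be sketched roughly as follows. This is only an illustration: the rule tables, the labels, and the `classifyDoi` function are hypothetical and are not taken from the script itself, and the `10.4172/example` DOI used in the usage note is made up.

```javascript
// Hypothetical rule tables for illustration; the real script's lists differ.
const prefixRules = [
  { prefix: '10.4172', label: 'OMICS Publishing Group' },
];
const startingPatternRules = [
  { pattern: '10.1016/j.chaos.', label: 'Chaos, Solitons & Fractals' },
  { pattern: '10.1016/S0960-0779', label: 'Chaos, Solitons & Fractals' },
];

// Returns the label of the first matching rule, or null if nothing matches.
function classifyDoi(doi) {
  // DOIs are case-insensitive, so compare in lower case.
  const normalized = doi.trim().toLowerCase();
  // The registrant prefix is everything before the first slash.
  const prefix = normalized.split('/')[0];
  const byPrefix = prefixRules.find((rule) => rule.prefix === prefix);
  if (byPrefix) return byPrefix.label;
  // Fall back to the more fragile "starting pattern" comparison.
  const byPattern = startingPatternRules.find((rule) =>
    normalized.startsWith(rule.pattern.toLowerCase()));
  return byPattern ? byPattern.label : null;
}
```

With these tables, `classifyDoi('10.4172/example')` is matched by prefix, both Chaos, Solitons & Fractals DOIs quoted above are matched by starting pattern, and `10.1023/a:1022444005336` matches nothing.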
Because the script is looking for strings that correspond to URL domains anywhere in the URL, it could match the URLs of other websites. For example, the script cannot distinguish between
[https://www.deprecated.com/MOON-CHEESE-CURES-CANCER.html Moon cheese cures cancer!] → Moon cheese cures cancer!
[https://www.alexa.com/siteinfo/deprecated.com Deprecated.com traffic analysis] → Deprecated.com traffic analysis
Both will be highlighted as 'deprecated', even though Alexa.com is not.
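To see why both links are caught, here is a minimal sketch of "match anywhere in the URL" next to a stricter hostname-only check. The pattern and both function names are assumptions made for illustration; the script's actual rules are more elaborate, and the hostname-only variant is not something the script is documented to do.

```javascript
// Illustrative pattern only; not the script's actual rule.
const deprecatedPattern = /\bdeprecated\.com/i;

// Matches the domain string anywhere in the URL (the behaviour described above).
function matchesAnywhere(url) {
  return deprecatedPattern.test(url);
}

// Hypothetical stricter variant: only inspect the hostname of the URL.
function matchesHostname(url) {
  return deprecatedPattern.test(new URL(url).hostname);
}
```

`matchesAnywhere` returns true for both example links, while `matchesHostname` returns true only for the first. The stricter check would avoid the Alexa false positive, but it would presumably also miss URLs that carry the original address in their path (for instance, archived copies), which is one plausible reason to match anywhere.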
Because the script is looking for sources that are often/generally problematic in some way, sources that are generally acceptable (e.g. Toxicology Reports) will not get flagged if they are misused or if a certain article is not reliable. For example,
is a retracted paper. The script has no way of knowing that this is the case, and thus will not flag the paper as problematic.
Likewise, an op-ed published in the New York Times is only as reliable as the contributor who wrote it, but the script has no way of knowing whether an article is an op-ed or a regular article, and will not flag it. A "regular" New York Times article might also fail higher sourcing requirements like WP:BLP or WP:MEDRS. The script will not flag those either.
Likewise, if no URL/DOI is provided, the source will not get flagged. For example, the following paper
is from an iMedPub journal, a subsidiary of the (in)famous OMICS Publishing Group predatory publisher, and is not getting flagged because of a lack of recognizable URL/DOI.
For technical reasons, the script will also sometimes highlight entire comments made in ordered and unordered lists (i.e. comments that start with * or #):
Keep When searching for sources, I found something on Deprecated.com that would indicate that the Foobarological Remedies are responsible for over 25% of remissions. This should count for meeting WP:N. User:Example ( talk) 17:29, 19 August 2020 (UTC)
- Actually, that site is not a reliable source, and does not establish notability, much less efficacy per WP:MEDRS. User:Example2 ( talk) 18:29, 19 August 2020 (UTC)
This can be avoided by giving the actual link, in which case only the link will be highlighted:
- Keep When searching for sources, I found something on Deprecated.com that would indicate that the Foobarological Remedies are responsible for over 25% of remissions. This should count for meeting WP:N. User:Example ( talk) 17:29, 19 August 2020 (UTC)
- Actually, that site is not a reliable source, and does not establish notability, much less efficacy per WP:MEDRS. User:Example2 ( talk) 18:29, 19 August 2020 (UTC)
You can also use a [.] instead of a dot to suppress the highlighting.
- Keep When searching for sources, I found something on Deprecated[.]com that would indicate that the Foobarological Remedies are responsible for over 25% of remissions. This should count for meeting WP:N. User:Example ( talk) 17:29, 19 August 2020 (UTC)
- Actually, that site is not a reliable source, and does not establish notability, much less efficacy per WP:MEDRS. User:Example2 ( talk) 18:29, 19 August 2020 (UTC)
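The [.] trick works because the bracketed dot breaks the literal domain string a rule is looking for. A minimal sketch, assuming a simplified pattern and a hypothetical `mentionsSource` helper (neither is taken from the script):

```javascript
// Illustrative pattern only; not the script's actual rule.
const sourcePattern = /\bdeprecated\.com/i;

// Returns true if the text contains the literal domain string.
function mentionsSource(text) {
  return sourcePattern.test(text);
}

const plain = 'I found something on Deprecated.com that would indicate...';
const defanged = 'I found something on Deprecated[.]com that would indicate...';
```

Here `mentionsSource(plain)` is true and `mentionsSource(defanged)` is false, since `[.]` puts extra characters between "Deprecated" and "com".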
Another alternative is to use : instead of * or # to start the comments.
- Keep When searching for sources, I found something on Deprecated.com that would indicate that the Foobarological Remedies are responsible for over 25% of remissions. This should count for meeting WP:N. User:Example ( talk) 17:29, 19 August 2020 (UTC)
- Actually, that site is not a reliable source, and does not establish notability, much less efficacy per WP:MEDRS. User:Example2 ( talk) 18:29, 19 August 2020 (UTC)
However, this will cause accessibility problems, suppress such highlighting in the entire comment chain (including future comments), and will cause the script to not warn you (or anyone else) that a problematic source was mentioned (including on later comments), so this method is not normally recommended.
This section in a nutshell: This section is for advanced users looking to create custom rules tailored to their specific needs; most people don't need this. If you think a problematic source that isn't currently covered by the script should be, make a request on the talk page so that everyone using the script can see it flagged as problematic. |
It is possible to define your own set of additional rules (so that, for example, you could test a new rule locally before proposing it). These rules will be applied after the default rules, so if a link matches both a default rule and a custom rule, only the default rule's formatting will be applied. To add custom rules, create the page Special:MyPage/unreliable-rules.js and add the following:
unreliableCustomRules = [
  {
    comment: 'Name of the rule', // Will show as a tooltip
    regex: /regex rules/i,
    css: {CSS style to apply to links that match the rule},
    filter: (filter to use for the rule, optional),
  },
];
See the section below for concrete examples. You may add additional rule blocks by copying and pasting the code between the curly braces multiple times. Make sure that the closing curly brace has a comma after it. You can also look at User:Headbomb/unreliable.js for other examples (search for the phrase var rules; your custom rules should be formatted the same way).
For example, if you do not wish to have Google Books links highlighted in yellow, you can add the following to Special:MyPage/unreliable-rules.js
unreliableCustomRules = [
  {
    comment: 'Plain google books',
    regex: /\b(books\.google)/i,
    css: { "background-color": "" }
  },
];
and the background colour #fffdd0 will no longer be applied. If you instead want to change Google Books links to a different color with a red border, like #7cfc00, then use
css: { "background-color": "#7cfc00", "border":"2px solid red"}
instead of
css: { "background-color": "" }
in the above example.
If you have a specific source that needs to be added, you should generally ask for it to be added on the talk page of the script (if obvious) or at WP:RSN (if consensus is needed); this way everyone using the script can benefit from its detection. However, if the source doesn't warrant being flagged by the script for everyone, but you'd like it to be flagged for you (for example, Biodiversity Heritage Library and ChemSpider links), you can create your own rules by adding the following to Special:MyPage/unreliable-rules.js
unreliableCustomRules = [
  {
    comment: 'Biodiversity Heritage Library',
    regex: /\b(biodiversitylibrary\.com)/i,
    css: { "background-color": "#40e0d0" }
  },
  {
    comment: 'ChemSpider',
    regex: /\b(chemspider\.com)/i,
    css: { "background-color": "#d8bfd8" }
  },
];
and these links will be highlighted in #40e0d0 and #d8bfd8 respectively.
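Before saving custom rules, it can help to check their regular expressions against a few sample URLs, for example in the browser console. The sketch below is not part of the script: `firstMatchingRule` is a hypothetical helper, the sample URLs in the usage note are made up, and the rules are declared with `const` only so the snippet runs on its own.

```javascript
// Local copy of the custom rules above, for testing outside the script.
const customRules = [
  {
    comment: 'Biodiversity Heritage Library',
    regex: /\b(biodiversitylibrary\.com)/i,
    css: { 'background-color': '#40e0d0' },
  },
  {
    comment: 'ChemSpider',
    regex: /\b(chemspider\.com)/i,
    css: { 'background-color': '#d8bfd8' },
  },
];

// Hypothetical helper: returns the first rule whose regex matches the URL, or null.
function firstMatchingRule(rules, url) {
  return rules.find((rule) => rule.regex.test(url)) || null;
}
```

For instance, `firstMatchingRule(customRules, 'https://www.chemspider.com/some-page')` returns the ChemSpider rule, and a URL on an unrelated domain such as `https://example.org/` returns `null`. If a URL you expect to be highlighted returns `null`, the regex probably needs adjusting.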
While I ( Headbomb) came up with the idea for the script and am the person maintaining it, the basic script was designed by SD0001 with refinements by Jorm and creffett. Anything clever in the code is from them. I'm mostly just maintaining the list of sources to be covered.