This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the
current talk page.
Per a request at
Wikipedia:Village pump (technical)#Popular user scripts,
SD0001 created
a script to update the table at
Wikipedia:User scripts/Most imported scripts. While it's great that the table can now be updated without too much effort, the ongoing work of keeping this table up to date on a periodic basis seems like a task better suited to a bot than a human. Plus, bots have the higher query limit and shouldn't mind waiting a bit longer for results in order to lessen the server load of ~1700-1900 API calls (e.g. by making sequential API calls and/or using an appropriate maxlag setting). - Evad37 [talk] 13:45, 18 November 2019 (UTC)
Is there a way the bot could differentiate active from inactive users,
Evad37? (I suppose differentiating admins from editors would be trivial.) Also, if the bot could also spit out the change in number of imports, so that we could rank by trending scripts, that'd be a bonus!
Guarapiranga (
talk) 01:46, 20 November 2019 (UTC) —
diff
These are theoretically possible, but might be prohibitive in terms of the number of API calls required (which is already huge just for the basic count). - Evad37 [talk] 14:56, 20 November 2019 (UTC)
I'm not sure if a bot is really needed for this, but am happy to set up one if desired. Though there are ~1800 API calls, the responses are tiny for each one, and these are search queries - which are inexpensive because of search indexing. The API calls aren't all being sent in parallel - a concurrency limit of 50 is being applied. Maxlag settings are necessary only for long and drawn-out bot tasks, right? Here, the script takes less than a minute to execute. Regarding
Guarapiranga's suggestions, differentiating active users isn't feasible using the API - it probably could be handled using database queries - but I'm not familiar with them. OTOH, noting the change in number of imports from the last update is doable.
SD0001 (
talk)
19:03, 20 November 2019 (UTC)
Even a concurrency of 50 may be a bit much;
mw:API:Etiquette recommends serial queries. Maxlag is also recommended for all (non-interactive) requests, not just "long" ones.
Anomie⚔12:45, 21 November 2019 (UTC)
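A minimal sketch of what serial, maxlag-aware counting could look like. It is not the script discussed above: the function names, the `insource:` search string, and the retry policy are assumptions made for illustration. The network call is passed in as `fetch`, so the loop can be tried without touching the API.

```python
import time
from typing import Callable, Dict, Iterable


def build_search_params(script_title: str, maxlag: int = 5) -> Dict[str, str]:
    """Parameters for one search query that only needs the total hit count."""
    return {
        "action": "query",
        "list": "search",
        # Assumed search string; the real script may search differently.
        "srsearch": f'insource:"{script_title}"',
        "srnamespace": "2",      # User namespace
        "srlimit": "1",          # the hits themselves are not needed
        "srinfo": "totalhits",
        "srprop": "",
        "maxlag": str(maxlag),   # ask the server to refuse work when lagged
        "format": "json",
    }


def count_imports_serially(
    titles: Iterable[str],
    fetch: Callable[[Dict[str, str]], dict],
    pause: float = 0.0,
    max_retries: int = 3,
) -> Dict[str, int]:
    """Issue one request at a time, retrying a title when maxlag is reported."""
    counts: Dict[str, int] = {}
    for title in titles:
        params = build_search_params(title)
        for _ in range(max_retries):
            data = fetch(params)
            if data.get("error", {}).get("code") == "maxlag":
                time.sleep(pause)  # back off, then retry the same title
                continue
            counts[title] = data["query"]["searchinfo"]["totalhits"]
            break
    return counts
```

In real use, `fetch` would wrap an HTTP GET against the wiki's `api.php` and return the decoded JSON; a non-zero `pause` (or the server's Retry-After value) would be the polite choice.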
It is possible to use the API to determine if a user is active – if you approximate active as being at least X edits in the past Y days – with
mw:API:Usercontribs. That of course means getting the full results set for the search queries (instead of just the total hits) and extracting the username from each result. And it would increase the number of queries by at least 1 order of magnitude. - Evad37 [talk] 05:04, 22 November 2019 (UTC)
If desired, could also use list=allusers with auactiveusers. I believe that's just 1 edit in the last 30 days, but it's something. It returns the number of actions for each user, so it could also be subsequently filtered for higher definitions of active. I think you can also limit the results to just autoconfirmed or extendedconfirmed users. ~ Amory (u • t • c) 11:07, 22 November 2019 (UTC)
API:Usercontribs is weird. We'll need a separate query for every single user - but there are about 80,000 users in the present table. If we give the API multiple users in one query, it lists all edits by one user before beginning another user - there appears to be no option to make it sort results by timestamp rather than by user.
auactiveusers only lists out all active users; there is no option to list active users from a given set of users. We would need to first get the list of 138,000 active users (28 queries), and then we'd have to pull in the full search results (which can be done in 1 query for each script except the top 2 scripts, which need 2 because of >5000 installations) and check how many of these users are in our list of active users. This does seem just about practical enough.
SD0001 (
talk)
17:57, 23 November 2019 (UTC)
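The last step described above (checking how many importing users appear in the active-users list) is a set intersection. Here is a rough sketch, assuming the search results arrive as page titles of the form `User:Name/something.js`; the function names and input shapes are made up for illustration.

```python
from typing import Dict, Iterable, List


def owner_of(page_title: str) -> str:
    """Return the username that a user-space page title belongs to."""
    without_namespace = page_title.split(":", 1)[1]
    return without_namespace.split("/", 1)[0]


def count_active_importers(
    hits_by_script: Dict[str, List[str]],
    active_users: Iterable[str],
) -> Dict[str, int]:
    """For each script, count the distinct importing users who are active."""
    active = set(active_users)
    counts: Dict[str, int] = {}
    for script, page_titles in hits_by_script.items():
        # A user importing the same script from two pages is counted once.
        importers = {owner_of(title) for title in page_titles}
        counts[script] = len(importers & active)
    return counts
```

Building `active` once and reusing it for every script keeps the cost at one pass over the search results, on top of the fixed number of queries needed to fetch the active-user list.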
Bot for Simplifying Medical Jargon
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
Not a good task for a bot; if you want to discuss non-bot ways of doing changes, you should do so on a more appropriate page.
Anomie⚔03:32, 27 January 2020 (UTC)
Hello, I was wondering if it might be possible to construct a bot to streamline the otherwise tedious task of fixing numerous (thousands of) Wikipedia articles that use certain medical jargon terms and revise them to their more widely understood counterparts. For example, I would propose a bot that changes the phrase "renal failure" to "kidney failure" (except on pages where it is part of a quote or in the title of a research article that is being cited, if a bot can be programmed to screen for those exceptions). There are several other examples and programming is foreign to me. Please let me know if this is a viable idea for an otherwise tedious (and herculean) task. Thank you!
TylerDurden8823 (
talk)
00:46, 26 January 2020 (UTC)
Not a good task for a bot. This sort of task falls under
WP:CONTEXTBOT. Word replacements, like automatic spell checking, typically have a higher-than-desired chance of causing more problems than they fix. --
AntiCompositeNumber (
talk)
01:34, 26 January 2020 (UTC)
Okay,
AntiCompositeNumber. Also, just for feedback: it's not particularly constructive or kind to say "bad idea". It's unnecessary and demeaning. Instead, perhaps just say this probably won't work because..., as you did above, and skip the negativity in the edit summary. That doesn't feel very collaborative. Thanks.
TylerDurden8823 (
talk)
05:00, 26 January 2020 (UTC)
When I use a pre-formed message template, like {{BOTREQ}}, my edit summary usually ends up as the key for the message I used. In this case, that's {{
BOTREQ|badidea}}. I'm sorry that it came across as demeaning, that wasn't my intention. --
AntiCompositeNumber (
talk)
05:06, 26 January 2020 (UTC)
Regular Wikipedia articles on medicine are generally still written at a higher than desired reading level and many of them require further simplification of jargon for a general audience (a target of ~ an 8th grade reading level). The Simple Wikipedia suggestion doesn't really help me in this case, but thanks. Also, thank you for your reply Anti.
TylerDurden8823 (
talk)
23:21, 26 January 2020 (UTC)
@
TylerDurden8823: You might consider using
Wikipedia:AutoWikiBrowser which is semi-automated. It will do simple search and replaces automatically, but the user has to check each one to make sure they are appropriate and can reverse them with a click before saving. If you want help setting that up, let me know.
SchreiberBike |
⌨ 23:28, 26 January 2020 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
This bot is unfortunately down again and both of the users who maintained it have departed the project. Without the bot there is no on-wiki record of UTRS appeals, leaving the system ripe for
WP:FORUMSHOP abuses. At the very least if the "notify user" functionality could be replicated, that would be great.
Beeblebrox (
talk)
21:25, 4 December 2019 (UTC)
Huh. I thought UTRS itself verified emails or something for this sort of reason. Well that blows, but thanks for your replies.
Beeblebrox (
talk)
01:47, 10 December 2019 (UTC)
Archiving live links to SR/Olympics before it goes down
Hello. I wasn't sure where to put this as this is a request in regards to a live website that's going down in a few months. SR/Olympics will be closing by
March 2020. On Wikipedia, there are
945 articles that are using this website URL, with 2
here and 391 more
here (might be duplicates). I feel that neither InternetArchiveBot nor WaybackMedic would be suitable for this request as the links aren't dead yet. Should a bot archive these links before they break or wait? --
MrLinkinPark333 (
talk)
19:25, 6 January 2020 (UTC)
Is it possible for someone to search the English Wikipedia and create a list of articles that do not follow
WP:BOLDAVOID? Specifically, the search would need to find if there is any use of linking ([[ ]]) within the bolded (''' ''') portion of the first sentence of the article. If this is possible, would you be able to provide an output list of the linked article names here:
User:Gonzo_fan2007/BOLD. Cheers, « Gonzo fan2007(talk) @ 21:50, 27 January 2020 (UTC)
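A rough sketch of the check being asked for, for trying on a single page's wikitext. The heuristics are assumptions, not an established tool: the "first sentence" is taken to be everything up to the first ". " on the first line that contains bold markup, so abbreviations and multi-line leads will produce false results.

```python
import re
from typing import List

# Text between a pair of ''' markers (non-greedy, single line).
BOLD_RE = re.compile(r"'''(.+?)'''")


def bold_links_in_first_sentence(wikitext: str) -> List[str]:
    """Return the bolded spans of the first sentence that contain a wikilink."""
    for line in wikitext.splitlines():
        if "'''" in line:
            first_sentence = line.split(". ", 1)[0]
            return [
                span
                for span in BOLD_RE.findall(first_sentence)
                if "[[" in span
            ]
    return []
```

A page would go on the output list whenever the returned list is non-empty. Running this over a database dump rather than the live API would avoid a very large number of page fetches.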
Someone's already got a bot for that, if I remember correctly. Don't remember who it is or I'd check to see if they're still operating.
Primefac (
talk)
14:51, 28 January 2020 (UTC)
Bot to add signature and timestamp to hundreds of user talk page messages added by Geregen2
Geregen2 has added hundreds of "proposed deletion" messages to
User talk:WildCherry06 and other user talk pages without signing any of them with four tildes. I suggest that a bot add a signature (Geregen2's signature, not the bot's signature) and timestamp to the end of all those messages using {{subst:
unsigned}}.
GeoffreyT2000 (
talk)
15:41, 19 December 2019 (UTC)
Not sure if a bot just for that is necessary. Signing is somewhat important but hassling everyone with an extra bot edit does not sound like a good idea to me.
Jo-Jo Eumerus (
talk)
16:32, 19 December 2019 (UTC)
The first part should be relatively uncontroversial: it is to generate a page (for instance,
User:Tigraan/Exxx redirects) containing a list of all pages which are redirects and whose title matches the regexp E1?[0-9]{3}[a-j]?. (If there is an easy way that I could do it myself, please enlighten me.) Bonus points if the page contains the current redirect targets as well. I estimate this would be around 1000 pages.
The third part would be to clean up after the RfD, either by untagging and leaving things in place if rejected, or by retargeting the redirects according to a relatively simple scheme.
TigraanClick here to contact me17:49, 23 January 2020 (UTC)
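The title pattern from the first part can be tried locally before running it against the whole wiki. A small sketch, assuming the regexp is meant to match the entire title (hence `fullmatch`); the helper name is made up.

```python
import re
from typing import Iterable, List

# The pattern quoted in the request, applied to the whole title.
TITLE_RE = re.compile(r"E1?[0-9]{3}[a-j]?")


def matching_titles(titles: Iterable[str]) -> List[str]:
    """Keep only the titles that match the E-number pattern in full."""
    return [title for title in titles if TITLE_RE.fullmatch(title)]
```

Whether a matching title is actually a redirect, and where it points, still has to come from the database or the API; the pattern only narrows the candidates.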
@
AntiCompositeNumber: thanks for the quarry request, especially the
modified version which eliminates quite a few false positives (such as
E112). I downloaded the CSV. I checked all titles and ~20 articles and it all looks in order.
This is a very large nav template and placement in an article will matter, a lot. Placement probably shouldn't be automated, and it probably should be collapsed by default. --
GreenC15:48, 22 January 2020 (UTC)
Bot for merging Russian locality permastubs
After a discussion
here a couple weeks ago, there was a rough local consensus that it might be beneficial to merge the majority of Russian rural locality articles (95% of which are two-line permastubs) to list articles (currently these lists are by first-level division, such as
List of rural localities in Vologda Oblast). As you can see on that article,
Fram is in the process of merging the pertinent information from the individual stubs into tables, but it's tedious work and there's something on the order of like
10,000 or so of such articles.
If the stubs in question all have an identical structure, it could work. Otherwise we might get
WP:CONTEXTBOT issues. That said, did that merger proposal get more widely advertised than just the user talk page you link there?
Jo-Jo Eumerus (
talk)
07:02, 29 November 2019 (UTC)
Not hugely, but to be honest, the total interested audience for these stubs is basically
Nikolai Kurbatov, their creator,
Ymblanter, as probably our most prolific Russia-focused editor, Fram, who came across them and proposed the merge, and myself, because I maintain the "List of rural localities in X" articles. ♠
PMC♠
(talk)07:15, 29 November 2019 (UTC)
I would say if one can add the stubs with identical structure to the lists it would be already very useful. Everything else can be done manually (or not done at all, we have quite a few fully developed articles on the topic).--
Ymblanter (
talk)
07:45, 29 November 2019 (UTC)
Yes, it would be a great help if a bot could do this. The only harder parts are getting the population information from the article, and "deciding" whether to redirect the article or whether to keep it as a standalone article. Perhaps the bot can use some measure of the length of the article and do a cut-off based on this? It's a redirect, so any errors in this regard can be easily reverted by anyone.
Fram (
talk)
07:54, 29 November 2019 (UTC)
I wonder if the bot could scrape the info onto a sub-page or a draft page of some kind to be checked by humans before being mainspaced. That way we can make sure the info is getting properly, er, populated. ♠
PMC♠
(talk)07:56, 29 November 2019 (UTC)
Among
moth articles, and I suspect many others, there are sometimes template links to Wikispecies and Wikimedia Commons but there's nothing at the target location in the sister project. I'd love to see a bot which could go through and check these and remove the deceptive templates.
Even better if it could remove links to Commons if the only file in Commons is already in use in the article.
Another refinement would be to change from a general Commons link to a Commons category link when that exists.
I know this is going to be quite a bit of work, however I feel it will have significant value once the process has caught up.
I refer to the error cat "Tidy bug affecting font tags wrapping links (4,275,998 errors)" as at today, some of which date back to 2006.
As an example, the following signature;
[[User:AndonicO|<font face="Papyrus" color="Black">'''A'''</font><font face="Papyrus" color="DarkSlateGray">ndonic</font><font face="Papyrus" color="Black" size="2">'''O'''</font>]] <small><sup><font face="Times New Roman" color="Tan">[[User talk:AndonicO|''Talk'']]</font> | <font face="Times New Roman" color="Tan">[[User:AndonicO/My Autograph Book|''Sign Here'']]</font></sup></small>
has various errors that may cross several error categories and will never be fixed under the current methodology. A bot that does a simple find and replace, with something like
[[User:AndonicO|talk]] signature adjusted by lint bot for lint errors.
would fix every instance of each signature as identified and could cover many instances in order. This is especially important for these aged and non-active users. It could also be used to identify current user signatures with errors, so that we could offer a reformatted signature solution. Thoughts?
121.99.108.78 (
talk)
00:03, 28 January 2020 (UTC)
I don't think there is consensus to drastically reformat editors' signatures in the way that is proposed here. I have been performing edits
like this one that fix Linter errors without changing the rendering of the signatures, and I have had no negative feedback, as far as I can remember. At least one bot,
Wikipedia:Bots/Requests for approval/Ahechtbot 2, has been approved to perform a limited set of Linter-related fixes to talk pages. It is possible that
Ahecht would be willing to run a bot with a broader set of fixes. –
Jonesey95 (
talk)
01:21, 28 January 2020 (UTC)
I don't see that there's an absence of consensus either. Past objections to 'fixing' signatures were mostly because the proposed 'fixes' were based on ill-defined personal-preference criteria, like changing something like <b>...</b> to '''...'''. Lint errors are a clear criterion. This would probably have consensus, although that's still not a guarantee. Basically, take it to
WP:VPR and see how the dice lands. Headbomb {
t ·
c ·
p ·
b}01:42, 28 January 2020 (UTC)
Ahechtbot is a blunt tool. I've only been doing find-and-replace on signatures that (a) are present and identical on very large numbers of pages and (b) affect the function or appearance of the rest of the talk page, not just the signature itself. For the example above, since you're not getting bleedover to other parts of the text, it's not really worth the overhead and extra edits. Frankly, this should be labeled as a Tidy bug, and it should be fixed so that <font>[[link]]</font> works just as well as [[link|<font>link</font>]]. Yes, I know that font tags are deprecated, but there are literally millions of pages that use the former format. --
Ahecht (
TALK PAGE)
14:57, 28 January 2020 (UTC)
Ahecht and others: formerly, <font>[[link]]</font> did work like [[link|<font>link</font>]]. That is, <font color="x">[[link]]</font>, and also <font style="color:x">[[link]]</font> both were processed by Tidy into [[link|<font...>link</font>]] (piped appropriately, of course). The font tag had to immediately wrap the Wikilink or External link, otherwise it was ignored. The font color, but not the font style, is detected as the Tidy font link bug, but the font style version of the Tidy font link bug is quite rare. Tidy has been replaced, so now font coloring tags immediately wrapping a Wikilink or external link are overridden, as you would logically expect, by default link colors. The replacement parser is an
HTML 5-compatible upgrade from Tidy and we are not going back.
Wikipedia:Linter#How you can help was written November 23, 2017, and since it was first written it has always said that it is OK to fix lint errors, including on talk pages, but one should "[t]ry to preserve the appearance." So, for more than two years, it has officially been OK to de-lint user signatures, preserving the appearance, and this has never been officially challenged or disputed; it is the consensus. (However, I don't think there's a consensus on systematic lint fixing by bot.) The Tidy font bug is a high priority lint error and I would favor fixing these lint errors in a systematic way by bot, taking care, of course, to exclude talk page discussions where fixing an instance of this error would confuse a question about this exact behavior. —
Anomalocaris (
talk)
02:22, 3 February 2020 (UTC)
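The appearance-preserving rewrite described above (moving a color-carrying font tag from outside a wikilink to inside its label, as Tidy used to do) can be sketched with one substitution. This is an illustration only, not the code of any approved bot: it handles a font tag that immediately wraps a single wikilink and ignores external links and the font-style variant.

```python
import re

# <font ...color=...>[[target]]</font> or <font ...color=...>[[target|label]]</font>
FONT_LINK_RE = re.compile(
    r"<font\s+([^<>]*\bcolor\s*=[^<>]*)>"
    r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]"
    r"</font>",
    re.IGNORECASE,
)


def fix_font_wrapped_links(wikitext: str) -> str:
    """Move the font tag inside the link label so the color still applies."""

    def move_inside(match: "re.Match[str]") -> str:
        attrs, target, label = match.group(1), match.group(2), match.group(3)
        shown = label if label is not None else target
        return f"[[{target}|<font {attrs}>{shown}</font>]]"

    return FONT_LINK_RE.sub(move_inside, wikitext)
```

Because the output keeps the same font attributes and the same visible label, the rendering should match what Tidy used to produce, in line with the "preserve the appearance" guidance quoted above.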
Hi all, I saw that "
peer reviews" are now included on
WP:Article alerts (yay!). Unfortunately it turns out there are more than a few reviews that editors haven't opened properly. These will clog up article alert lists and I was wondering if I could have some help with a bot to process them (or even generate a list to give me).
I suggest that all such unopened peer reviews not opened within the last week be simply removed from the talk page with the summary "remove unopened peer review"
I can manually remove this, but as a repetitive action that may take some time I'd be very grateful if a bot could do it for me :)
If there is a way to preserve the code of this, I can keep a link in the
WP:PR archives so it can be run every year or so.
Tom (LT), According to
https://quarry.wmflabs.org/query/42330, there are 42 pages transcluding {{peer review}} without a corresponding WP:Peer review/<title>/archive page right now. Unless misfiled peer review requests are more common than that query indicates, it seems like this is something human editors can handle. You can re-run the query in the future by logging in to Quarry, hitting Fork, then hitting Submit Query. --
AntiCompositeNumber (
talk)
17:48, 21 February 2020 (UTC)
Good point, we're working on it over at WP:PR, and this is less than I thought! Your query is very useful. Happy for this bot request to be taken down. --
Tom (LT) (
talk)
06:30, 29 February 2020 (UTC)
We're in the process at
WikiProject New York (state) of converting 5 WikiProjects to taskforces. Specifically the following are being consolidated, under the statewide banner:
However, we can't just convert the existing templates to wrappers and have AnomieBOT substitute them without creating a mess of duplicates, because some pages are tagged by more than one subproject or are already tagged with {{WikiProject New York (state)}} in addition or both, e.g.
Talk:Albany, New York.
I don't want to take the time to do the scripting just yet unless it's necessary, going off the assumption that this has been done frequently enough that there's already a working version for project mergers. I'll just take a quick minute here to give some basic examples to avoid confusion.
Case
{{WikiProject Capital District|class=c|importance=mid}}
Output
{{WikiProject New York (state)|class=c|importance=|Capital=yes|Capital-importance=mid}}
Case
{{WikiProject New York (state)|class=c|importance=low}}
{{WikiProject Capital District|class=c|importance=mid}}
Output
{{WikiProject New York (state)|class=c|importance=low|Capital=yes|Capital-importance=mid}}
Case
{{WikiProject New York (state)|class=c|importance=low}}
{{WikiProject Capital District|class=c|importance=mid}}
{{WikiProject Hudson Valley|class=c|importance=high}}
Output
{{WikiProject New York (state)|class=c|importance=low|Capital=yes|Capital-importance=mid|Hudson=yes|Hudson-importance=high}}
Case
{{WikiProject Capital District|class=c|importance=mid}}
{{WikiProject Hudson Valley|class=c|importance=high}}
Output
{{WikiProject New York (state)|class=c|importance=|Capital=yes|Capital-importance=mid|Hudson=yes|Hudson-importance=high}}
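The four cases above can be reproduced with a small merge routine. This is a sketch under assumptions: only the two subprojects shown in the examples are mapped (the others would need their own prefixes), banners are assumed to contain no nested templates, and the merged banner is simply placed at the top of the text.

```python
import re
from typing import Dict, List, Optional, Tuple

PARENT = "New York (state)"
# Subproject name -> task-force parameter prefix, taken from the examples.
TASKFORCES = {"Capital District": "Capital", "Hudson Valley": "Hudson"}
BANNER_RE = re.compile(r"\{\{WikiProject ([^|{}]+)((?:\|[^{}]*)?)\}\}\n?")


def parse_params(raw: str) -> Dict[str, str]:
    """Turn '|class=c|importance=mid' into {'class': 'c', 'importance': 'mid'}."""
    params: Dict[str, str] = {}
    for part in raw.split("|")[1:]:
        key, _, value = part.partition("=")
        params[key.strip()] = value.strip()
    return params


def merge_banners(wikitext: str) -> str:
    """Fold subproject banners into a single statewide banner."""
    parent: Optional[Dict[str, str]] = None
    forces: List[Tuple[str, Dict[str, str]]] = []

    def collect(match: "re.Match[str]") -> str:
        nonlocal parent
        name = match.group(1).strip()
        if name == PARENT:
            parent = parse_params(match.group(2))
            return ""
        if name in TASKFORCES:
            forces.append((TASKFORCES[name], parse_params(match.group(2))))
            return ""
        return match.group(0)  # unrelated banners are left alone

    rest = BANNER_RE.sub(collect, wikitext)
    if not forces:
        return wikitext  # nothing to merge
    if parent is not None:
        merged = dict(parent)
    else:
        # No statewide banner yet: reuse the class, leave importance blank.
        merged = {"class": forces[0][1].get("class", ""), "importance": ""}
    for prefix, params in forces:
        merged[prefix] = "yes"
        merged[f"{prefix}-importance"] = params.get("importance", "")
    body = "".join(f"|{key}={value}" for key, value in merged.items())
    return "{{WikiProject " + PARENT + body + "}}\n" + rest
```

Keeping the statewide banner's own parameters first and appending the task-force ones afterwards matches the parameter order shown in the expected outputs.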
I'm not particularly active around here, but I should be around for an hour or two more today; I will try to find time at least once every 48 hours this week to log in and do some work, so hopefully I'll be able to respond to any inquiries reasonably promptly, thank you. (please
ping on reply)
Bot to update number of Duolingo users doing a course
Hi there! I’ve been editing the Duolingo Wikipedia article to keep it up to date with the number of learners on each course. I was wondering if there’s a bot that could update the lists daily, rather than having to do it myself, or how I could create such a bot? Thanks! :-) — Preceding
unsigned comment added by
CcfUk2018 (
talk •
contribs)
03:28, 22 January 2020 (UTC)
I'm not sure that's the kind of minor statistical data that Wikipedia needs, let alone needs updated on a daily basis. ♠
PMC♠
(talk)15:02, 22 January 2020 (UTC)
@
Sdkb and
Spaced about: I will not count articles that are listed on a level other than the current page, to prevent double counting. Please tell me if it would be better to count those articles anyway. Thank you --
Kanashimi (
talk)
11:40, 23 January 2020 (UTC)
VA5: Sports, games and recreation is completely wrong right now. Also, I think it'd be better to list an article's icon for current quality status always first (instead of icons for peer reviews etc.) because that way they're more easily compared via skimming and they're what the vital article project is most concerned about. The icons that haven't been traditionally listed (peer review, in the news) might even be unnecessary.--
LaukkuTheGreit (
Talk•
Contribs)
15:25, 23 January 2020 (UTC)
@
Kanashimi: I am rather confused, is there an active bot that regularly updates the count of the level 5 vital articles? I saw that someone counted all of them and I'm very happy, but it's not clear how that was done. The idea was to have a bot that counts the number of articles, so we'll know when we are done with the 50,000 goal. Is there such a bot?
Fr.dror (
talk)
15:16, 2 February 2020 (UTC)
@
Kanashimi: many thanks to you and everyone else working on this. Is there functionality here to add the VA tag to the talk pages of the articles listed? It looks like that
used to be done by
Feminist's SSTbot, but that bot
now says it's been deactivated. I'm a bit confused overall why so many of the bots related to VA have stopped functioning, given that there don't seem to have been any major technical changes that might have broken them. The VA project is ongoing, and the bots that help with it are thus needed on an ongoing basis as well.
Sdkb (
talk)
06:56, 3 February 2020 (UTC)
Since it'll be reviving SSTBot task 4, hopefully it won't require writing too much new code. For articles that haven't been assessed yet and are without the VA tag, it'd probably be best to leave them unassessed rather than labeling them all start-class; it's possible to add the tag without marking the class, right?
Sdkb (
talk)
07:46, 3 February 2020 (UTC)
SSTbot 4 is quite stupid as it involves compiling article lists manually. I don't really know how to code beyond an elementary level, and I kind of just got tired of "operating" it using AWB.
feminist (
talk)
08:28, 3 February 2020 (UTC)
Hi there, re:
this permalinked discussion, could you stellar bot handlers please remove the |residence= parameter and subsequent content from articles using {{Infobox person}}? Per some of the discussions,
Category:Infobox person using residence might list most of the pages using this template. And
RexxS said:
"Using an insource search (hastemplate:"infobox person" insource:/residence *= *[A-Za-z\[]/) shows 36,844 results, but it might have missed a few (like {{plainlist}}); there are at least 766 uses of the parameter with a blank value."
I explained my objection to this proposal in the Removal section immediately below the closed discussion in the permalink above. It is not a good idea to edit 38,000 articles if the only objective is a cosmetic update. Further, there is no rush and the holiday season is not a good time to make a fait accompli of an edit to the template performed on Christmas Day.
Johnuniq (
talk)
06:40, 27 December 2019 (UTC)
Cyphoidbomb, I already have
a bot task that can handle this, but it sounds like there is some contention about the actual removal, so ping me somewhere if and when the decision about how to deprecate the param is finished.
Primefac (
talk)
16:24, 27 December 2019 (UTC)
I just want to add support for this. I'm quite tired of seeing the residence error when I do quick previews before saving edited bios.
—МандичкаYO 😜
11:02, 8 February 2020 (UTC)
This is simple and can be handled by just about any bot.
The Federal Telecommunications Institute (IFT) of Mexico made a one-character change in document URLs that will need updating. Hundreds of Mexican radio articles cite its technical and other authorizations.
They added a "v" to the URL, so URLs that were formerly
@
Raymie: In addition they now serve https only, but left no http->https redirect, and most all of the links on WP are http. This should be done by URL-specific bot because of archive URLs and {{
dead link}} tags (some may already be marked dead and/or archived that need to be unwound once corrected). Could you post/copy the request to URLREQ, there is a backlog but I will get to it. --
GreenC20:02, 8 February 2020 (UTC)
Request for change of (soon to be) broken links to LPSN
Hello.
I want a bot.
Because I'm a student, I'm unable to be as active on Wikipedia as is required.
So I think that if I get a bot, then when I am unable to do something like templating about 100 pages, my bot will do that instead of me. — Preceding
unsigned comment added by
Tanisha priyadarshini (
talk •
contribs)
16:07, 7 April 2020 (UTC)
This is a page for requesting bots to do certain tasks. A bot is not something granted to you by Wikipedia, it is something you program and create by yourself in order to edit Wikipedia pages. Please check
Help:Creating a bot for more information.
Sam1370 (
talk)
00:24, 8 April 2020 (UTC)
Articles needing an infobox backlog reducer
Resolved
There are many articles (or at least enough that it would be tedious to go through and check each and every one) in the backlog (en.wikipedia.org/wiki/Category:Wikipedia_infobox_backlog) that actually do have infoboxes. I think a bot could be given a category of 200-500 pages, read the wikitext of each one, and, if the page has {{infobox in it, go to the talk page of that article and remove the needs-infobox=yes parameter.
Firestarforever (
talk)
13:39, 28 March 2020 (UTC)
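The two checks in this request are simple enough to sketch on plain wikitext strings. The helper names are made up, and fetching and saving pages is left out; this only shows the text test and the parameter removal.

```python
import re

INFOBOX_RE = re.compile(r"\{\{\s*infobox", re.IGNORECASE)
# The parameter, up to (but not including) the next pipe or the closing braces.
NEEDS_INFOBOX_RE = re.compile(
    r"\|\s*needs-infobox\s*=\s*yes\s*(?=\||\}\})", re.IGNORECASE
)


def has_infobox(article_wikitext: str) -> bool:
    """True if the article text contains an opening {{infobox ...}} call."""
    return INFOBOX_RE.search(article_wikitext) is not None


def remove_needs_infobox(talk_wikitext: str) -> str:
    """Drop needs-infobox=yes from any banner on the talk page."""
    return NEEDS_INFOBOX_RE.sub("", talk_wikitext)
```

A bot would call `remove_needs_infobox` on the talk page only when `has_infobox` is true for the article, and skip the save when the text comes back unchanged. Infoboxes that are transcluded through a wrapper template with a different name would be missed by this test.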
Address implicit/structural composition gender bias by bot
While it is likely impossible to automate all of these guidelines
Wikipedia:Writing_about_women, things like using last name or relationships in lede are systematic bias which can have systematic solutions. A bot attempting to do this would be AMAZING (where exceptions like Icelandic folks would be an opt out rather than opt in) — Preceding
unsigned comment added by
Icy13 (
talk •
contribs)
21:39, 25 February 2020 (UTC)
Icy13, interesting idea, but I think it's way too much of a
WP:CONTEXTBOT problem. The bot would need to be able to do the following:
Identify the article as a biography of a female - not necessarily an easy task for a bot! Gender may or may not be mentioned in categories, isn't in the infobox and just checking for words like "she" and "her" wouldn't be sufficient.
Identify the subject's given and family names - again, not necessarily an easy task, given the number of name formatting styles. I don't just speak of patronymic names like Icelandic, but surname-first family names, Spanish names which contain surnames from both parents, mononymous people, probably several more cases I haven't thought of.
Recognize problematic sentences like those you suggested.
If there's such a bot coded, it would probably have to be the kind that creates a report on a centralized page, rather than one that edits anything in the article or its talk page. Headbomb {
t ·
c ·
p ·
b}22:17, 25 February 2020 (UTC)
Maintenance tags for questionable sources
When using footnoted referencing, the task of assessing what source supports what text is complicated. A reference may be tagged e.g. {{self-published source}}, {{self-published inline}}, {{deprecated inline}}, {{dubious}} and other tags which may be applied to the footnoted reference but these are not linked to the readable content. When using <ref> tags, by contrast, we can use <ref>{{cite [...] | publisher=$VANITYPRESS [...] {{self-published source}}</ref>{{self-published inline}} to flag both the reference and the inline citation.
That's really too much of a
WP:CONTEXTBOT here. For example,
WordPress is a venue for a lot of self-published things, but that doesn't mean it's necessarily wrong to cite them (
WP:RSCONTEXT), so applying {{self-published source}} to some of those would be flagging a problem that isn't one.
A
WP:CITEWATCH/
WP:UPSD-like solution really is the best thing here. The CiteWatch only looks for |journal=, but a similar bot could be coded to look for domains found in |url= and |publisher/website/magazine/journal/work/...=Headbomb {
t ·
c ·
p ·
b}22:22, 25 February 2020 (UTC)
Convert Non-free reduce to Non-free manual reduce for images that are not .jpg or .png
Greetings. I'm here once again to bother you all about Files!
Tagging a file {{Non-free reduce}} places it in
Category:Wikipedia non-free file size reduction requests, where
User:DatBot performs the file size reduction automatically if the file is in .png or .jpg format. However, DatBot doesn't process any other format, and therefore files in other formats need manual processing.
I am requesting a bot to, once daily, check all files in
Category:Wikipedia non-free file size reduction requests and, if the file format is not .png or .jpg, change {{Non-free reduce}} into {{Non-free manual reduce}}, so that they're more readily processed.
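The per-file decision and the tag swap can be sketched as two small helpers. The names are hypothetical, and the extension test follows the request literally (only .png and .jpg count as automatically handled, compared case-insensitively, which is an assumption about how DatBot treats them).

```python
import re

AUTO_FORMATS = {"png", "jpg"}  # formats the request says are handled automatically
REDUCE_TAG_RE = re.compile(r"\{\{\s*[Nn]on-free reduce\s*(?=\||\}\})")


def needs_manual_reduce(file_title: str) -> bool:
    """True when the file's extension is not one that is reduced automatically."""
    parts = file_title.rsplit(".", 1)
    return len(parts) < 2 or parts[1].lower() not in AUTO_FORMATS


def retag(wikitext: str) -> str:
    """Swap {{Non-free reduce}} for {{Non-free manual reduce}}, keeping parameters."""
    return REDUCE_TAG_RE.sub("{{Non-free manual reduce", wikitext)
```

A daily run would list the category members, call `needs_manual_reduce` on each title, and save the `retag` result only for the files where it returns true.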
Bot needed to tell Wikiprojects about open queries on articles tagged for that WikiProject
I sometimes put queries on article talkpages, some get answered quickly, some stick around indefinitely and occasionally old ones get resolved. My suspicion is that my experience is not unusual, but I hope that this is a software issue and that a lot more article queries could be resolved if the relevant editors knew of them. Would it be possible to have a bot produce reports for each Wikiproject of open/new talk page threads that are on pages tagged to that project? ϢereSpielChequers09:49, 10 February 2020 (UTC)
One way that might work, but would throw a lot of false positives, would be to notify the project(s) if there is a post that has no reply after a week. Another option would be to have some form of template like {{SPER}} that could summon the bot if a user wanted more input.
Primefac (
talk)
12:50, 10 February 2020 (UTC)
@Headbomb I was assuming a new query would be any new section on the talkpage of an article tagged for that wikiproject, excluding any tagged as {{resolved}}. @Primefac I'm not sure of the false positives, other than on multi tagged articles. If that did get to be an issue it might be necessary to give people going through such a report the option to mark a section as not relevant to their wikiproject. So an article about a mountain might be tagged under vulcanism, climbing, skiing and still get a query as to the gods that some religion believes live on it. But I suspect that the false positives will not be a huge issue. ϢereSpielChequers16:16, 10 February 2020 (UTC)
"any new section on the talkpage of an article tagged for that wikiproject, excluding any tagged as {{resolved}}" given that most sections on talk pages don't need to be marked as {{resolved}} to begin with, I can't see this idea/criteria getting consensus. The signal-to-noise ratio would be ludicrously small. Taking
Talk:Clara Schumann from a few sections above as an example, that would be 39 'queries' for that article alone. Headbomb {
t ·
c ·
p ·
b}00:50, 11 February 2020 (UTC)
Clearly that article is not typical. But the most recent thread is from January this year, the previous one from last October, so a report of any new section would include it now provided new was interpreted as broadly as thirty days. If we only went back 7 days it would already have dropped off the report. In the unlikely event of needing to make the report shorter, if someone has a tool for identifying signatures it could list single participant threads. ϢereSpielChequers07:17, 11 February 2020 (UTC)
I would certainly veto such an idea. These would likely be spam levels of updates, and duplications for those that would watch the updater and also the article itself. I would assume most would unwatch the updated list of "queries" pretty quickly, which would be pointless. A better solution is to post on the wikiproject talk page if a post doesn't get enough attention.
There are also a LOT of inactive/semi active Wikiprojects that would get a lot of bot updates, for no one to read. Seems like a lot of work and edits when we could simply post something on the wikiproject talk page to gain additional input. Best Wishes, Lee Vilenski(
talk •
contribs)08:45, 11 February 2020 (UTC)
Yes lots of wikiprojects are inactive, perhaps some will be revived by having this report, others will be unchanged. The report would be a success if an increased proportion of talkpage queries get a response, 100% response rate would be nice, but this report aims to reduce a problem not to totally resolve it. As for posting things on WikiProject talkpages, that is reasonable advice to the regulars, not something we expect newbies to do, and in case it wasn't obvious, it is unnoticed queries by newbies that I worry most about. ϢereSpielChequers09:15, 11 February 2020 (UTC)
That just gives you an indication that there has been a change to a page not that there is a query that needs to be responded to on that page.
Keith D (
talk)
00:03, 12 February 2020 (UTC)
If I'm reading the intent correctly, I think this can be resolved by, alternatively, (1) using WikiProject banners to encourage editors to ask the question there instead, (2) relying on
WP:Article alerts to list all
WP:RFCs of consequence in the project, or (3) adding some tag with lower stakes than an RFC (e.g., a variant of {{help me}}) and submitting a feature request for
WP:Article alerts to track that template/tag. On the whole, could use more evidence that this is an actual problem. Agreed that it would be a lot of noise to create a listing for every new, unreplied talk page section on every project page, especially when such sections do not necessarily require responses (e.g., "FYI" messages). czar01:57, 17 February 2020 (UTC)
I'm in need of help replacing all instances of a set of WikiProject templates as taskforces of the one unified template: {{
WikiProject Molecular Biology}}. Unfortunately a simple transclusion of the new template wrapped in the old templates isn't enough, since some pages have multiple WikiProject templates, so will need to be marked with multiple taskforces. It's therefore similar to when
Neurology was merged into WP:MED.
For {{WikiProject Molecular and Cell Biology}} AND {{WikiProject Genetics}} AND {{WikiProject Computational Biology}} AND {{WikiProject Biophysics}} AND {{WikiProject Gene Wiki}}
For whichever WikiProject template has the highest |importance= and |quality=, add that as the overall |importance= and |quality= to {{WikiProject Molecular Biology}}
Additionally add to articles in the following categories:
@
Primefac: You mentioned in the earlier thread on this topic that one can use Anomiebot to merge templates using {{
Subst only|auto=yes}} template to merge one banner into another, but is there any support for merging multiple banners on a single page into 1? If not, are there any bots that have been approved to merge multiple project banners on talk pages (particularly where 2+ banners occur on a single page) into a single parent banner? Asking because I could likely modify the source code of a bot designed to merge the banners of another project's task forces for this purpose, especially if there's one written in python.
Seppi333 (
Insert 2¢)
03:45, 16 January 2020 (UTC)
@
Seppi333: You make a good point about what to put as overall WP:MOLBIO class and importance based on WP:MCB, WP:GEN etc. at
WT:MOLBIO. I think the best option is to simply use the current taskforce importance (if something's high importance to the WP:GEN taskforce, chances are it's high importance to the WP:MOLBIO wikiproject). The edge case is when two taskforces currently indicate different importance levels (e.g.
Talk:DNA_gyrase). In such cases it might be safest to use the median rounded up for the overall importance (high+low→mid, high+mid→high), but maybe that's overcomplicating things.
T.Shafee(Evo&Evo)talk04:45, 16 January 2020 (UTC)
Sure, I ran a bot like this last weekend. I could probably put in a BRFA today or tomorrow if I get time.
Primefac (
talk) 10:55, 16 January 2020 (UTC) I did just notice, though, that there are also sub-projects for each of the (now) sub-projects; are those tasks forces (such as
genetic engineering or
education) being handled by the replacement template as well?
Primefac (
talk)
10:59, 16 January 2020 (UTC)
@
Primefac: No, don't think so. The primary reason the Gene Wiki sub-task force was added is that it has its own banner (w/ corresponding article categories: {{
WikiProject Gene Wiki}} &
Category:Gene Wiki articles) which is currently present on ~1800 pages. I think we're probably just going to go with the current task force listing in the {{
WPMOLBIO}} template.
@
Evolution and evolvability: I added the signaling parameter for categorizing cell signaling articles;
Category:Metabolism is an article category and the metabolic pathways task force doesn't have its own category, so I couldn't add the metabolism one. Addendum, re: The edge case is when two taskforces currently indicate different importance levels (e.g.
Talk:DNA_gyrase). In such cases it might be safest to use the median rounded up for the overall importance (high+low→mid, high+mid→high), but maybe that's overcomplicating things. It wouldn't be that technical to encode that. Programmatically, one just needs to ordinally encode low→1, mid→2, high→3, top→4 (NB: this method implicitly assumes that there's an equal
"importance distance" in a mathematical/statistical sense between importance ratings, which might not necessarily be true - it depends on how people go about rating importance on average), then use round(median(list of ratings)) or round(average(list of ratings)), then remap whatever number it returns back to an importance rating. E.g., the average rating of task forces that rate an article as low, high, and top is (1+3+4)/3, which would be rounded to 3 → high importance.
Seppi333 (
Insert 2¢)
02:56, 18 January 2020 (UTC)
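The encoding described above can be sketched in a few lines. This is a minimal illustration, not the bot's code: the function name is made up, and it uses the median with halves rounded up, which matches the high+low→mid and high+mid→high examples. Python's built-in `round()` rounds halves to the nearest even number, so it is avoided here.

```python
import math
from statistics import median

# Ordinal scale from the discussion: low→1, mid→2, high→3, top→4.
LEVELS = ["low", "mid", "high", "top"]
TO_NUM = {name: i + 1 for i, name in enumerate(LEVELS)}


def combined_importance(ratings):
    """Map task-force ratings to one overall rating via the median, rounding .5 up."""
    nums = [TO_NUM[r.lower()] for r in ratings]
    mid = median(nums)                  # raises StatisticsError on an empty list
    rounded = math.floor(mid + 0.5)     # round half up, unlike built-in round()
    return LEVELS[rounded - 1]
```

For example, `combined_importance(["high", "low"])` gives `"mid"` and `combined_importance(["high", "mid"])` gives `"high"`. Swapping `median` for `statistics.mean` gives the averaging variant mentioned above.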
@
Seppi333: That looks correct to me! Great to see it coming together. I'll also go through the taskforce pages and relevant template documentation over the next few days to make sure the instructions for tagging new articles are up to date (
example).
T.Shafee(Evo&Evo)talk23:11, 22 January 2020 (UTC)
@
Primefac: Are you still interested in doing this? Either way, can you point me to the bot script you had in mind in the event I have a need for reprogramming it to run a similar bot in the future?
Seppi333 (
Insert 2¢)
04:42, 21 February 2020 (UTC)
When an
AfD discussion ends with no discussion,
WP:NOQUORUM indicates that the closing admin should treat the article as an expired
PROD (
"soft delete"). As a courtesy/aid for the closer, it would be really helpful for a bot to report the article's PROD eligibility ("the page is not a redirect, never previously proposed for deletion, never undeleted, and never subject to a deletion discussion"). Cribbing from the last discussion, it could look like this:
When an AfD listing begins its seventh/final day (almost full term) with no discussion, a bot posts a comment on whether the article is eligible for soft deletion by checking the PROD criteria that the page:
isn't already redirected (use API)
hasn't been PROD'd before (check edit summaries and/or diffs; or edit filter
if ever created)
has never been undeleted (check logs)
hasn't been in a deletion discussion before (check page title and talk page banners)
nice-to-have: list prior titles for reference, if the article has been moved or nominated under another name before
To check whether anyone has participated in the AfD, @
Izno suggested borrowing the AfD counter script's detection
This would greatly speed up the processing of these nominations. Eventually would be great to have this done automatically, but even a user script would be helpful for now. czar19:26, 29 December 2019 (UTC)
@
Czar: Is it good enough if a bot just reports these attributes for AfDs that expire with no discussion?
Whether the page is redirected or not
List all previous
WP:AfD and
WP:AFU with the results.
I was thinking that a more general scoped bot which tells the AFD whether there were previous redirectings, (un)deletions and deletion discussions might be useful to inform the discussion of past changes.
Jo-Jo Eumerus (
talk)
09:23, 24 January 2020 (UTC)
@
Kanashimi, that would cover 75% of the criteria a closer needs to know (and would at least be a start!) so would need to remind the closer to check the page history for prior PRODs as well. Otherwise, yes, that's exactly what I think would work here. Essentially, if it detects positive for any of those criteria, would be nice to summarize that it's ineligible for soft deletion because of x criterion. czar13:03, 25 January 2020 (UTC)
@
Kanashimi, it's a start! I was thinking of formatting along the lines of:
Extended content
posting something like this to the AfD discussion when no one else has !voted
Note to closer: While this discussion appears to have
no quorum, it is NOT eligible for
soft deletion because it has been
previously PROD'd. (can link to the live diff instead if the PROD wasn't successful)
Note to closer: While this discussion appears to have
no quorum, it is NOT eligible for
soft deletion because the subject is currently a redirect.
Note to closer: From lack of discussion, this nomination appears to have
no quorum. This bot did not detect previous PRODs, previous AfD discussions, previous undeletions, or a current redirect, so this nomination appears to be eligible for
soft deletion at the end of its seven-day listing if it has not been previously
proposed for deletion; please review the article's edit history.
In this case, wouldn't need to list the entire history but just say at a glance (or the strongest reason) why the article isn't eligible for soft deletion. Eh? czar02:46, 8 February 2020 (UTC)
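The boilerplate above could be produced by something like the following sketch. It assumes the four facts have already been looked up elsewhere (logs, page history, talk page banners). The dictionary keys, the function name, and the order in which reasons are preferred are all hypothetical, and two of the reason wordings are extrapolated from the samples.

```python
NOT_ELIGIBLE = (
    "Note to closer: While this discussion appears to have no quorum, "
    "it is NOT eligible for soft deletion because {reason}."
)
ELIGIBLE = (
    "Note to closer: From lack of discussion, this nomination appears to have "
    "no quorum. This bot did not detect previous PRODs, previous AfD discussions, "
    "previous undeletions, or a current redirect, so this nomination appears to be "
    "eligible for soft deletion at the end of its seven-day listing if it has not "
    "been previously proposed for deletion; please review the article's edit history."
)

# Checked in this order; the first hit is reported as the reason (assumed ranking).
CHECKS = [
    ("prior_afd", "it has been previously discussed at AfD"),
    ("prior_prod", "it has been previously PROD'd"),
    ("prior_undeletion", "it has been previously undeleted"),
    ("is_redirect", "the subject is currently a redirect"),
]


def closer_note(facts):
    """Return a single note naming one disqualifying reason, or the eligible text."""
    for key, reason in CHECKS:
        if facts.get(key):
            return NOT_ELIGIBLE.format(reason=reason)
    return ELIGIBLE
```

Reporting only the first matching reason keeps the note short, in line with the "at a glance" suggestion.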
@
Kanashimi, this is great! If it ran at the beginning of the 7th day of listing for
Articles for deletion/Vikram Shankar and
Articles for deletion/Anokhi, which for lack of participation would appear eligible for soft deletion, the closer would know that it's not actually the case. I'm not sure that the list of deletions/undeletions is needed for this case but open to other opinions. At the very least, pictorial image use is historically discouraged in AfD discussions. A few fixes:
The
Wikipedia:Articles for deletion/List of Greta Thunberg speeches would need a tweak. What would make it ineligible is if the existing article (under discussion) was redirected elsewhere, leaving its history in the same location, meaning that someone redirected it in lieu of deletion. In this case, the article (and its page history) was moved to a new location, so this case should check both whether the title redirects AND whether the page history remains. Page moves would still be eligible for soft deletion/expired PROD by my read.
Reem Al Marzouqi: The rationale for this should not be "previously deleted" but specifically "previously discussed at AfD", which supersedes whether or not it was deleted. (Deletion itself doesn't make the article ineligible—e.g.,
Hasan Piker and
Heed were each only deleted through CSD—but specific signs that someone has previously considered the article ineligible for PROD.) Same applies to the remaining "2nd+ nomination"s listed.
And of course there's the caveat that the script wouldn't have actually run on most of these (all but four?) since the rest had at least some participation.
Will this script catch whether the article was previously PROD'd? If not, would want to add something to the text to remind the closer to check. The rationale for
Heed (cat)'s ineligibility, for example, is that the article was previously PROD'd and contested (03:32, 1 June 2009), not that it was previously deleted via CSD. Same for
Paatti, which actually shows the PROD in the log (most do not, to my understanding).
Ayalaan actually appears eligible for soft deletion. Its deletion was through CSD and it appears to have not been previously PROD'd. As long as the script confirmed that the article was not tagged for PROD before, this would be a great case of where the script could say that the article appears eligible.
Yep, CSD/BLPPROD doesn't affect PROD/soft deletion eligibility (
WP:PROD#cite_ref-1), so don't need to track that.
If the v1 won't parse edit summaries or diffs, I've modified the collapsed section above with some suggested boilerplate. Of course, would be great if it could, but this would do for now.
It looks like all of those results would not run because the bot detects participation for each? The case of
Madidai Ka Mandir should let the bot run since the only participation is from a delsort script.
Henri Ben Ezra should let the bot run too (to post that it's ineligible based on having a prior AfD). So would need to tighten participation detection. If the bot/script is detecting one or fewer delete/redirect participations, the bot should run (e.g.,
Ayalaan and
Anokhi). Probably also want the bot to only run when the nom hasn't been relisted, or else it could potentially run twice on the same nomination.
As for the logs, related discussions, and previous discussions, I think it might be overkill to post these. It could be potentially interesting as its own bot task, if there is consensus for it, but I think simply showing "soft deletion" eligibility is sufficient for this task. I'll ask
Wikipedia talk:AfD for input. czar16:42, 9 February 2020 (UTC)
I didn't check all logs, but from the ones I did, the log analysis looks good! It doesn't look like the tests were doing "no quorum" detection, so as long as the script knows when it should run on a discussion (one or zero !votes in the last 24 hours of the AfD's seven-day listing) then sounds good to proceed to the next step/trial. Thanks! czar01:49, 17 February 2020 (UTC)
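The run condition described here (one or zero !votes, final 24 hours of the seven-day listing, not relisted) can be sketched as a small predicate. The function name and inputs are hypothetical. `vote_count` is assumed to be computed elsewhere and to exclude delsort notices.

```python
from datetime import datetime, timedelta, timezone

LISTING_LENGTH = timedelta(days=7)


def should_post(listed_at, now, vote_count, relisted):
    """True when the AfD is in its final 24 hours with at most one !vote and no relist."""
    if relisted:            # avoid running twice on the same nomination
        return False
    if vote_count > 1:      # more than one !vote counts as participation
        return False
    closes_at = listed_at + LISTING_LENGTH
    return closes_at - timedelta(hours=24) <= now < closes_at
```

Both datetimes should be timezone-aware and in the same zone (UTC is the natural choice for AfD timestamps).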
{{resolved}}
I've been using
User:Ucucha/HarvErrors.js for a few days now, and it's a pretty nice little script. However, the issues it highlights should be flagged for everyone to see and become part of regular cleanup. For example, in
Music of India, two {{harv}}-family templates are used to generate reference to anchors, designed to point to a full citation.
However, inspecting the page reveals those anchors aren't found anywhere on the page. Even a manual search won't find the corresponding citations on that page, because this isn't an issue of someone having forgotten a |ref=harv in a citation template, they just aren't there to begin with.
A bot should flag those problems, probably with a new template {{broken footnote}}, or possibly on the talk page.
This is a very good idea. Recently came across this in
Easter Island ref #113 (Fischer 2008). There is no reference for Fischer 2008. In fact the reference is a faux-Harvard <ref>Fischer 2008: p. 149</ref> Lot of permutations for Harvard reference problems that a specialized bot could become expert on. --
GreenC20:09, 21 February 2020 (UTC)
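A rough sketch of the check such a bot would perform: collect the anchors that short citations point to and compare them with the anchors the page defines. This is heavily simplified and the names are hypothetical. It only recognises anchors written explicitly as `|ref=CITEREF...`, ignores `|ref=harv` and `{{SfnRef}}`, and would not catch plain-text cases like the Fischer 2008 example, which uses no template at all.

```python
import re

# Short-citation templates; the first capture group holds the raw parameter list.
SFN = re.compile(r"\{\{\s*(?:sfn|harvnb|harv)\s*\|([^{}]*)\}\}", re.IGNORECASE)
# Explicitly declared anchors such as |ref=CITEREFGröner1991.
REF_ANCHOR = re.compile(r"\|\s*ref\s*=\s*CITEREF([^|}\s]+)")


def sfn_anchor(params):
    """Build the target anchor from the positional (unnamed) parameters."""
    positional = [p.strip() for p in params.split("|") if "=" not in p]
    return "CITEREF" + "".join(positional)


def missing_anchors(wikitext):
    """Return the anchors that short citations point to but the page never defines."""
    defined = {"CITEREF" + m for m in REF_ANCHOR.findall(wikitext)}
    wanted = {sfn_anchor(m) for m in SFN.findall(wikitext)}
    return sorted(wanted - defined)
```

Pages with a non-empty result would be candidates for a tag or a maintenance category.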
Finnusertop, Yes, that is not possible inside the template. To be able to determine if the link works or not, the template would need access to the rendered page HTML. Since the rendered page HTML is only available after the template itself is parsed and rendered, the template can't look at it. A MediaWiki extension could hook itself into the parsing chain, but
changes in the parsing system make that a bad idea, especially in the next year or so. --
AntiCompositeNumber (
talk)
21:25, 23 February 2020 (UTC)
@
Finnusertop: That would be possible, but it would require someone to maintain it. Consensus would probably be required to have it default-on as well. Pages that have broken citations that have less editing traffic would also be unnoticeable, since the script wouldn't be able to apply tracking categories. This sort of problem is more directly comparable to
dead links than missing parameters. --
AntiCompositeNumber (
talk)
21:38, 23 February 2020 (UTC)
Not only that, but turning that on would also throw pointless warnings (anchors without refs) and fail to populate maintenance categories. Headbomb {
t ·
c ·
p ·
b}21:50, 23 February 2020 (UTC)
Updating essay impact assessments
{{resolved}}
Per
this conversation, the automated
essay assessment system has fallen badly out of date since BernsteinBot stopped updating it in 2012. It would be useful to revive it so that essay readers could have a better indication as to whether the essay they are reading is more likely to represent a widespread norm or just a minority viewpoint.
MZMcBride has provided the
original code, but it will need to be updated. Your help would be much appreciated. Regards,
Sdkb (
talk)
20:19, 22 March 2020 (UTC)
@
Sdkb,
MZMcBride, and
Moxy: Considering that the original script depends on python2, wikitools, toolserver SQL, and stats.grok.se, I decided to go the full rewrite route. The score calculation is the same, but watchers and pageviews data are retrieved from the MediaWiki action API. I made a test edit
here and the code is
here. Let me know if you have any comments or suggestions, as well as how often you want the report to be updated. I can then move forward to file a BRFA. --
AntiCompositeNumber (
talk)
19:15, 14 April 2020 (UTC)
Looks good to me; thanks for this! Xeno had mentioned this in the other discussion: Keep in mind some essays get linked through maintenance templates, which can greatly inflate their inbound links (we might have corrected for this, I can't remember). Does the code correct for that, and if not, does it seem like there's much distortion happening because of it? {{u|Sdkb}}talk21:58, 14 April 2020 (UTC)
It does not correct for template links, and I can't think of a good way to design such a thing short of parsing the wikitext for every page with a link (not a great idea, would make the report take 10x as long). I also don't see much of such an effect, outside of a few potential outliers. The scoring system weights pageviews the highest. I took the data and plotted it
[2], might be interesting for you. Pageviews are clearly driving most of the score, but watchers also have a noticeable effect. As pageviews and watchers decrease, the effect from links increases. --
AntiCompositeNumber (
talk)
00:18, 15 April 2020 (UTC)
{{resolved}}
Greetings, I'm here to bother you all about File namespace nonsense again.
User:RonBot, which was disabled because its operator went inactive a year ago, had an approved task to reduce the display size of SVGs (
BRFA). In its absence, there's quite a pile-up of SVGs awaiting reduction (over 100 currently). I tried to reduce them manually and failed, so now I'm here asking someone else to take up the task themselves. The source code for the task is
here.
Bot or AWB to fix identical CITEREF value error in 1,500 articles
{{resolved}}
I have found 1,575 articles that have an identical referencing error. The author name listed in {{sfn}} does not match the author's name as listed in the |ref= parameter in the matching full {{cite book}} citation template, which causes a non-working link from the short reference to the full reference. It also causes a red error message if you have the relevant script enabled.
I have performed a
sample fix here. Is there a kind AWB editor or bot operator who would be willing to fix the rest?
so the {{
sfn|Gröner|1991|p=...}} should actually be {{
sfn|Gröner|Jung|Maass|1991|p=...}} and that |ref=CITEREFGröner1991 (or |ref=CITEREFGr.C3.B6ner1991 if not yet modified) should be |ref=harv. --
Redrose64 🌹 (
talk)
19:14, 9 April 2020 (UTC)
You see misuse, I see
WP:CITEVAR. Plenty of books are referred to in this shorthand way in Sfn templates in order to keep the short references short. It may not be your idea of "correct", but it is consistent throughout this batch of articles, as far as I can tell. I'm trying to get a technical problem fixed. If someone wants to come along later and change these articles' established citation style, that can be a different discussion. –
Jonesey95 (
talk)
21:36, 9 April 2020 (UTC)
It's even mentioned
in the template documentation as a valid method of shortening long lists of authors (though the doc recommends "Gröner et al", which I agree with). The documentation also recommends |ref={{
SfnRef|Gröner|1991}} over |ref=CITEREFGröner1991. I think the SfnRef way is a bit cleaner (since it matches the {{sfn}} invocation), so would probably use that. --
AntiCompositeNumber (
talk)
02:11, 10 April 2020 (UTC)
Thing is 'Gröner 1991' is actually wrong, because it should be 'Gröner, Jung & Maass 1991' or 'Gröner et al. 1991', so it's not just a matter of fixing an anchor, it's a matter to having a proper short ref and fixing the anchor. Going {{
sfn|Gröner|Jung|Maass|1991|p=...}} is the way, and people can change it to {{
sfn|Gröner et al.|1991|p=...}} + |ref=Gröner et al. if they want to manually shorten the list of authors (most style guides say keep 3, so that's why the default is up to 3 named authors, and 4+ gets truncated to et al.). Headbomb {
t ·
c ·
p ·
b}02:22, 10 April 2020 (UTC)
Did you try manually going to any of the articles, for example
German submarine U-1015, and clicking on the link "Gröner 1991"? When I click on that link, it does not jump to or highlight the full citation. That is the problem I am hoping that someone will be willing to fix. I will fix them myself if necessary, but I know that AWB makes it a lot easier and less tedious. Fixing this batch of articles will fix about 5% of the total population of
Category:Harv and Sfn template errors. Thanks. –
Jonesey95 (
talk)
04:22, 10 April 2020 (UTC)
Keith D, thank you! I have cleaned up those three articles, which had redundant full citations in them. Thank you for your attention to detail. And
AntiCompositeNumber, thanks for the updated list. I will check those remaining articles. –
Jonesey95 (
talk)
15:25, 10 April 2020 (UTC)
It appears this has mostly resolved itself. Magnus's tools are some of the most highly-used on the projects, and I would encourage him to try to find some co-maintainers to assist in the operation and development of his tools. --
AntiCompositeNumber (
talk)
16:31, 22 April 2020 (UTC)
Lists of new articles by subject
My kingdom for a bot that compiles new articles in a new subject area (e.g., added to a WikiProject's scope). @
PresN currently runs a script that does this manually (see one of the "New Articles" threads at
WT:VG) but would love to be able to do this for other projects so that new editors get visibility/help and that the project can see the fruits of its efforts. (Also discussed
at PresN's talk page.)
Special:Contributions/InceptionBot currently finds articles that might be within scope but this proposal is instead a log of recent additions to a topic area (similar to how the 1.0 project compiles). It could be useful if delivered directly to a WikiProject/noticeboard page or, alternatively, updated on a single page and transcluded à la
WP:Article alerts. czar20:07, 15 December 2019 (UTC)
@
Czar: This seems like a fairly simple task, but I want to make sure I have all the details right: For every wikiproject that opts-in, each week, generate a list of articles that had that wikiproject's tag added to their talk page within that week. Information about the article should be included, including importance and quality rating and author. Should non-articles (cats, templates, files, etc) be considered as well, or just articles? What about drafts? Are newly-created redirects important? Do you want articles that were removed from the WikiProject, deleted or redirected too (this would make it more complex)? --
AntiCompositeNumber (
talk)
20:04, 3 February 2020 (UTC)
@
AntiCompositeNumber, yes, that's right! Can detect on the addition of the template or the addition to the category associated with the WikiProject (i.e.,
ArticleAlerts uses a combination of the banner and the talk category). I'd recommend including the quality rating but excluding the importance, à la {{article status}}. I'd recommend cutting scope to only include articles to keep the v1 reasonable. (Let someone request the extras if they have a valid case, but ArticleAlerts currently lists relevant deletions and AfC drafts.) In
WT:VG#New Articles (January 27 to February 2), as an example, I personally don't find the category reports useful. The goal of this bot, to my eyes, is to make WikiProject talk pages closer to topical noticeboards, so editors interested in a topic receive a digest of new article creations to pitch in either to contribute to the article or simply to welcome new/isolated editors. So while I recommend against listing files, cats, templates, drafts, importance, deletions, or redirects, if there is any stretch goal, I'd particularly recommend incorporating
InceptionBot's possibly related articles, in case there are new articles from the last week that may be eligible for the project but just haven't been tagged. But, yes, core function is just new, on-topic articles. czar03:07, 8 February 2020 (UTC)
Czar, how does
this (for
WP:VG) look? I'll add some explanatory text to it before calling it ready, but wanted to get your thoughts on what the output looks like now. I haven't written the bot part of the bot yet, just the category analysis. The report's only based on the category and doesn't particularly care about the template since that data is much more accessible. Also, do you know of wikiprojects that would be interested in the reports? --
AntiCompositeNumber (
talk)
22:34, 8 February 2020 (UTC)
@
AntiCompositeNumber, love it. Looks great! I'd start with WT:VG just for testing and can advertise/expand it to other projects. (FYI @
PresN, curious if this test is missing any articles you'd normally catch) czar00:31, 9 February 2020 (UTC)
@
AntiCompositeNumber and
Czar: Ah, the machines are coming for my (machine's) job... Yeah, there's some differences between this bot and my script's output for the week:
Source 2 on Feb 5 is missing- my script lists this because it was previously a redirect (for 2+ years) and was converted to a 'real' article on that day
Jablinski Games on Feb 7 is missing - was created as a redirect on Jan 30 with no talk page tag (so, not in the project), then converted to a 'real' article on Feb 7 and a talk page tag added.
You have
Animal Crossing Plaza on the 2nd when it was created/tagged on the 1st, but I think that might just be a time-zone issue
You list
Candy Crush Saga as new on Feb 1, when it's years old; this appears to be because of a crazy revert war with a vandal on the talk page on Jan 31.
So, from this limited sample set, it appears the main miss is considering redirect->!redirect as a 'creation', and not discounting the 'creation' of an existing page. That said, I fully expect there to be weirdness around page moves and double-page moves as well, but those are smaller corner cases. The other major difference is that my script would have listed 17 new categories as well (in addition to listing new article deletions and redirections/moves to draft space (aka soft deletions) this week, and (none this week) new templates/template deletions). --PresN05:36, 9 February 2020 (UTC)
@
PresN and
Czar: Jablinski Games and Animal Crossing Plaza are both just time zone issues. The bot considers the last 7 full days in UTC, so Animal Crossing Plaza was tagged at 01:05 and Jablinski shows up
today. The other two are because of the data source. My tool is only querying the categorylinks database table for recent additions of the category, so it doesn't pick up redirect -> article conversions. WP 1.0 Bot gets around this by logging article metadata into its own database, but that data isn't super accessible outside of parsing the on-wiki logs (afaict). The categorylinks table only cares about the page id, not the page title, so moves don't affect it. So while there is data for new categorization of drafts, I won't see articles that were previously tagged and were moved to mainspace. There is, of course, data for the tagging of drafts, files, categories, etc: I'm just ignoring it. Listing articles currently tagged for AfD or PROD wouldn't be too difficult, it's just a category intersection.
Code if you're curious --
AntiCompositeNumber (
talk)
16:13, 9 February 2020 (UTC)
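The "last 7 full days in UTC" window explains the apparent off-by-one dates. A sketch of that windowing, with hypothetical function names and with the category additions represented as plain (title, timestamp) pairs instead of rows from the categorylinks table:

```python
from datetime import datetime, timedelta, timezone


def last_full_days_window(now, days=7):
    """Return (start, end) covering the last `days` complete UTC days before `now`."""
    today_start = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return today_start - timedelta(days=days), today_start


def newly_tagged(additions, now):
    """Keep titles whose category-addition timestamp falls inside the window."""
    start, end = last_full_days_window(now)
    return [title for title, ts in additions if start <= ts < end]
```

Timestamps must be timezone-aware. A page tagged at 01:05 UTC lands on the UTC date even if it was still the previous evening for the editor, and anything tagged on the current (incomplete) day only appears in the next report.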
@
AntiCompositeNumber: OK, so assuming I'm reading this right, there's not really a good way to get un-redirects; they'd show up when the redirect is first created (which isn't ideal as most redirects never get undone, and most un-redirects are years later) but that's it. Same for draft->mainspace, but no issues with page moves. So your version would cover the majority of cases, but would miss those edge cases. That's probably fine for most wikiprojects, though- my non-data-based feeling is that it's the media projects that have the most "article created, redirected, and later re-created" occurrences, whereas projects that get less attention from eager fans don't get as many articles created prematurely.
Your code is definitely more readable than my spaghetti nonsense, though- for an example of what happens if you try to base this off of the WP1.0 bot output and then compound it by actually just parsing the html of
Wikipedia:Version 1.0 Editorial Team/Video game articles by quality log directly without any sort of api access and then make it worse by parsing top to bottom aka reverse temporal order, here's the python function that does the logic of building the list of article objects that appear to have been created in the date range given:
Extended content
def parse_lists(lists, headers, assessments, new_cats, dates, dates_needed):
    NULL_ASSESSMENT = '----'
    max_lists = dates_needed * 4
    extra_headers = get_extra_headers(headers)  # Note "Renamed" headers

    # Initial assessment
    for index, list in enumerate(lists):
        if index <= max_lists:
            for item in list.find_all('li'):
                contents = _.join(item.contents, ' ')
                offset = count_less_than(extra_headers, index) - 1
                date = dates[int(max((index - (1 + offset)), 0) / 3)]  # TODO: handles 3+ sections
                assess_type = assessment_type(contents)

                # Assessment
                if assess_type == ASSESSMENT:
                    namespaced_title = get_title(item, ASSESSMENT)
                    title = clean_title(namespaced_title)
                    old_klass = NULL_ASSESSMENT
                    new_klass = get_newly_assessed_class(item, namespaced_title)
                    # ignore files, redirects, and mayflies
                    if (not is_file(namespaced_title)
                            and not is_redirect_class(new_klass)
                            and not (title in assessments and was_later_deleted(assessments[title]))):
                        if is_category(namespaced_title):
                            init_cat_if_not_present(new_cats, namespaced_title)
                        else:
                            init_if_not_present(assessments, title)
                            assessments[title]['creation_class'] = new_klass
                            assessments[title]['creation_date'] = date

                if assess_type == REASSESSMENT:
                    namespaced_title = get_title(item, REASSESSMENT)
                    title = clean_title(namespaced_title)
                    old_klass = get_reassessment_class(item, 'OLD')
                    new_klass = get_reassessment_class(item, 'NEW')
                    if not is_file(namespaced_title):
                        init_if_not_present(assessments, title)
                        if is_redirect_class(new_klass):  # tag redirect updates as removals, unless later recreated
                            # Ignore if this is a draft -> mainspace move in 2 lines
                            if not (is_draft_class(old_klass) and 'creation_class' in assessments[title]):
                                assessments[title]['was_removed'] = 'yes'
                        elif is_redirect_class(old_klass):  # treat redirect -> non-redirect as a creation
                            assessments[title]['creation_class'] = old_klass
                            assessments[title]['updated_class'] = new_klass
                            assessments[title]['creation_date'] = date
                        else:  # only add the latest change, and only if there's no newer deletion
                            if 'updated_class' not in assessments[title] and not was_later_deleted(assessments[title]):
                                assessments[title]['updated_class'] = new_klass

                # Rename
                if assess_type == RENAME:
                    namespaced_old_title = get_rename_title(item, 'OLD')
                    namespaced_new_title = get_rename_title(item, 'NEW')
                    if not is_file(namespaced_new_title) and not is_category(namespaced_new_title):
                        new_title = clean_title(namespaced_new_title)
                        if is_draft(namespaced_old_title) and not is_draft(namespaced_new_title):
                            init_if_not_present(assessments, new_title)
                            if not was_later_updated(assessments[new_title]) and not was_later_deleted(assessments[new_title]):
                                assessments[new_title]['creation_class'] = DRAFT_CLASS
                                assessments[new_title]['updated_class'] = "Unassessed"
                                assessments[new_title]['creation_date'] = date
                        if is_draft(namespaced_new_title) and not is_draft(namespaced_old_title):
                            init_if_not_present(assessments, new_title)
                            if not was_later_updated(assessments[new_title]) and not was_later_deleted(assessments[new_title]):
                                assessments[new_title]['creation_class'] = "Unassessed"
                                assessments[new_title]['updated_class'] = DRAFT_CLASS
                                assessments[new_title]['creation_date'] = date

                # Removal
                if assess_type == REMOVAL:
                    namespaced_title = get_title(item, REMOVAL)
                    # Articles
                    if not is_file(namespaced_title):
                        title = clean_title(namespaced_title)
                        if title not in assessments:  # don't tag if there's a newer re-creation
                            assessments[title] = {'was_removed': 'yes'}
                            if is_category(namespaced_title):
                                assessments[title]['creation_class'] = CATEGORY_CLASS
                            if is_draft(namespaced_title):
                                assessments[title]['creation_class'] = DRAFT_CLASS
                    # Categories
                    if is_category(namespaced_title) and namespaced_title not in new_cats:
                        new_cats[namespaced_title] = 'was_removed'

    return {'assessments': assessments, 'new_cats': new_cats}
I don't think so, but I'll need to sit down and look at everything again. I'll probably have a chance for that sometime in the next two weeks. --
AntiCompositeNumber (
talk)
00:11, 2 March 2020 (UTC)
Plain text to template
I think I made this kind of request several years ago, but I can't find it in the archives.
Occasionally people add text like [citation needed] or (reference needed) to articles, and these articles don't end up in maintenance categories because they're plain text instead of templates. Could someone write a bot that would go around making edits like
this, or could an existing maintenance-bot operator add this task? I'm guessing that it would be rather simple — give it a list of phrases, tell it to look for them inside parentheses and brackets, and let it loose. Of course, this isn't a one-time problem, so if this is a good idea, it ought to be made an ongoing task.
Nyttend backup (
talk)
16:35, 20 April 2020 (UTC)
It's probably best to do this with AWB or another semi-auto tool, as there are some legitimate uses of "[citation needed]" in pages (on this page,
citation needed, and links to
citation needed for example).
[3] is a good search term for the first one; you can then regex search-and-replace \[?\[citation needed\]\]? with {{subst:cn}}. --
AntiCompositeNumber (
talk)
15:49, 22 April 2020 (UTC)
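A minimal Python sketch of the search-and-replace suggested above. The regex is the one quoted in the comment; the function name and the case-insensitive flag are assumptions added for illustration.

```python
import re

# Regex as quoted in the discussion; IGNORECASE is an added assumption.
PLAIN_CN = re.compile(r"\[?\[citation needed\]\]?", re.IGNORECASE)


def tag_plain_citation_needed(wikitext: str) -> str:
    """Replace plain-text [citation needed] markers with {{subst:cn}}."""
    return PLAIN_CN.sub("{{subst:cn}}", wikitext)
```

Note that the pattern also rewrites an unpiped wikilink such as [[citation needed]], which is one reason a semi-automatic tool with human review was recommended over an unattended bot.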
Hi, thanks for the help! I know there are some general maintenance tasks that AWB operators tend to look for. How do I ask that this kind of fix be added to their task list?
Nyttend backup (
talk)
19:13, 23 April 2020 (UTC)
I think that one part of the fix is to fix
User:Ucucha/HarvErrors.js so that it doesn't get fooled by the new functionality of the CS1/CS2 templates. That's no bot task. If the new functionality of the CS1/CS2 templates means that the |ref=harv parameter is no longer needed one could run a bot task to remove it.
Jo-Jo Eumerus (
talk)
08:58, 19 April 2020 (UTC)
I occasionally see this (and have done it once or twice), and it's annoying to have to un-archive when it happens. Could we get a bot to find instances of people using something like {{
DNAU|47}} and switch them to {{
subst:DNAU|47}}? (I know there are a few bots already running that substitute accidental transclusions, so perhaps one of them could be tasked to this without too much effort.) {{u|Sdkb}}talk04:16, 28 April 2020 (UTC)
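A rough Python sketch of the substitution requested above, assuming the bot only needs to handle the literal template name DNAU (redirects and other spellings are not covered). The function name is hypothetical.

```python
import re

# Matches {{DNAU}} or {{DNAU|...}} but not an already-substituted {{subst:DNAU|...}}.
UNSUBSTED_DNAU = re.compile(r"\{\{\s*DNAU\s*(\|[^{}]*)?\}\}")


def subst_dnau(wikitext: str) -> str:
    """Turn transcluded {{DNAU|n}} into {{subst:DNAU|n}}."""
    return UNSUBSTED_DNAU.sub(
        lambda m: "{{subst:DNAU" + (m.group(1) or "") + "}}", wikitext
    )
```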
As of right now, there are 1065 images tagged with F8 that need deleting (specifically,
these two categories are severely backlogged). Is it possible for a bot to perform this kind of maintenance based on transwiki checks to ensure that the Wikipedia and Commons versions of each file match, down to maximum resolution and filesize?
ToThAc (
talk)
17:33, 24 April 2020 (UTC)
I don't think it would be a wise task for a bot. Humans have to check additional features, in particular factors such as attribution and description. You might upload an en:wp file to Commons with a very different description, and whether it's better or worse is impossible for a bot to tell. And whether the attribution is right is basically impossible to determine without human discretion.
Nyttend backup (
talk)
18:10, 24 April 2020 (UTC)
Agreed with Nyttend. Each file requires human review, and there are a limited number of admins -
user:Fastily and
user:Jo-Jo Eumerus are the ones that come to mind - that work in that namespace. It's not just "have the description and attribution been carried over properly", it's also "is this file actually allowable on Commons or does it need to be deleted there and kept locally". The latter is likely why the backlog persists. It requires specialized knowledge and a fair amount of time.
I'd be willing to do the license vetting and make any corrections to the Commons pages that are necessary, but I'm not an admin, so I can't delete the files themselves. I can, however, pre-vet files and say "yeah, these are ready for you to delete" if an admin that doesn't normally work in the file namespace were willing to partner with me on that backlog.
The Squirrel Conspiracy (
talk)
22:29, 24 April 2020 (UTC)
I am inclined to agree that F8 deletions are not really the sort of thing that can be done by a bot. There is too much double-checking involved there.
Jo-Jo Eumerus (
talk)
08:57, 25 April 2020 (UTC)
A bot that categorizes (possibly tags) pages with embedded images and files that don't have alt text.
Basically the title. There are numerous articles with images (and other content) that should have alt text but do not.
MOS:ALT says that we should try to ensure that images have alt text for accessibility reasons, which is especially important for people who rely on screen readers because they cannot physically see the images. In a nutshell, said bot would probably check articles that have an embedded file such as a video, music, or image. It would then add the article to a maintenance category based on whether or not the embed has alt text, and possibly add a tag to the article letting readers (including people with screen readers) know that alt text is missing.
A related idea would be the same as the above but for math markup, which should probably be tagged/categorized separately due to the technical knowledge required to translate it into English.
Chess(talk) Ping when replying
01:52, 5 March 2020 (UTC)
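As a rough illustration of the check such a bot would need, here is a Python sketch that lists [[File:...]] links lacking an alt= parameter. It is a simplification under stated assumptions: it only looks at inline file links (not infobox or gallery images), and the function names are hypothetical.

```python
import re

FILE_START = re.compile(r"\[\[\s*(?:File|Image)\s*:", re.IGNORECASE)


def file_links(wikitext):
    """Yield each [[File:...]] link, tracking nested [[...]] inside captions."""
    for m in FILE_START.finditer(wikitext):
        depth = 0
        i = m.start()
        while i < len(wikitext):
            if wikitext.startswith("[[", i):
                depth += 1
                i += 2
            elif wikitext.startswith("]]", i):
                depth -= 1
                i += 2
                if depth == 0:
                    yield wikitext[m.start():i]
                    break
            else:
                i += 1


def files_missing_alt(wikitext):
    """Return the file links that have no |alt= parameter."""
    return [link for link in file_links(wikitext)
            if not re.search(r"\|\s*alt\s*=", link)]
```

Whether a missing alt= should be tagged at all is a separate editorial question; the sketch only shows that detection is mechanical.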
I've seen past discussions about ALT text and one thing that came up is that it's not always appropriate to have ALT text for an image and that ALT text is often hard to write. I am thus not sure we want to have a general maintenance tag for missing ALT text.
Jo-Jo Eumerus (
talk)
08:37, 5 March 2020 (UTC)
I agree with Jo-Jo about how hard writing one seems to be. I have seen everything from a repeat of the caption to a detailed description of the picture that mentions everything except who or what is in the image. If it does proceed, someone will want to write a guideline page giving detailed instructions about how to write one.
MarnetteD|
Talk08:42, 5 March 2020 (UTC)
Yes, if images are tagged as needing alt text, there is the likelihood that somebody will simply copy the caption to |alt= and feel that they are justified in removing the tag. No |alt= parameter is better than having a repeat of the caption. --
Redrose64 🌹 (
talk)
14:52, 5 March 2020 (UTC)
Is it possible to at least get alt text tagging for math markup? As of now many of these equations are inaccessible.
Chess(talk) Ping when replying
16:59, 5 March 2020 (UTC)
A bot and tag does not seem suitable to this problem. As it is today, the image alt tag is I believe filled by the LaTeX, which is a reasonable alt text. I would recommend improvements to MediaWiki core and the Math extension to emit a tracking category or Linter error instead. --
Izno (
talk)
00:23, 7 March 2020 (UTC)
@
Izno: The LaTeX can be questionable or even incomprehensible in many cases especially to someone not intimately familiar with the markup. For example, \and and \or are deprecated in favour of \land and \lor, which obviously can cause problems with screenreaders.
Help:Latex has a lot of examples and if you look at some of the LaTeX source for them you can see how it might be incomprehensible for a screen reader. Formatting instructions would presumably also be a pain to hear especially if there's a lot of them.
Anyways one of the main reasons for me requesting this is that I'd like to start adding some math markup alt-text myself. If it's not possible to get a bot to categorize, is there another potential way I could find a list of LaTeX equations in articles? I'm not good with coding so I'd love it if there was a way possibly with Regex or something. And if I were to do this is there some place I'd need to seek consensus before doing so? Also is there anywhere I could get a good opinion on how transcriptions of math should work?
Chess(talk) Ping when replying
00:29, 9 March 2020 (UTC)
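For the regex question above, a small Python sketch that pulls the LaTeX out of <math> tags in a page's wikitext. It assumes the wikitext has already been fetched, and it treats the presence of an alt attribute on the tag as "has alt text", which is an assumption about how alt text would be supplied. The function name is hypothetical.

```python
import re

# Captures the tag's attributes (group 1) and the LaTeX body (group 2).
MATH_TAG = re.compile(r"<math\b([^>]*)>(.*?)</math>", re.DOTALL | re.IGNORECASE)


def list_equations(wikitext):
    """Return (latex, has_alt) pairs for every <math> element in the text."""
    return [(m.group(2).strip(), "alt=" in m.group(1))
            for m in MATH_TAG.finditer(wikitext)]
```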
@
Chess: I said reasonable, not any other word that would indicate that all was right in the world. :) I have no doubt the LaTeX can be incomprehensible at times.
This won't tell you which ones have alt text and which don't, but
this search is as good as any. I suspect most of those pages need them, so edit to your heart's content. If you think you might need consensus, you should ask at
WT:MATH (your question is reasonable but I don't know better than you do if others will be disappointed by your changes).
As I said, I think it would be a good idea to change one of a couple of extensions to emit categories or Linter errors instead, if alt text is desirable generally.
Help:Bug reports is the place to start for that. --
Izno (
talk)
00:40, 9 March 2020 (UTC)
Thinking about it some more, alt text for math that would avoid ambiguity would require us to standardize some method of converting mathematical expressions to words in an unambiguous and consistent way. Such a
formal language would be an unprecedented undertaking that would probably require significant expertise far beyond what I have. Especially considering the reason why math has switched to symbols for expression is that wordy statements can be impossible to parse.
Chess(talk) Ping when replying
22:51, 10 March 2020 (UTC)
Converting inline interlanguage links to use Template:Interlanguage link?
It's not uncommon that inexperienced editors will add piped inline interlanguage links to articles that exist on a different Wikipedia in order to avoid red links. This is a contravention of the MOS, as it surprises the reader, and prevents links to valid articles once they are created. Such piped links should be replaced with the {{
Interlanguage link}} template, e.g.
Special:Diff/866719019.
Is this something that could feasibly be done by a bot? Are there valid intentional uses that shouldn't be changed? (I guess it's clearer with languages that use non-Latin script, since I can't think of a good reason to pipe a foreign name under English text, but I'm not sure about those which use the Latin alphabet.) --
Paul_012 (
talk)
02:48, 13 March 2020 (UTC)
There's an idea to explore here, but there should be a fair amount of testing on this. I feel oftentimes those interwiki links should be replaced with an enwiki link (as in we have an article, but someone used another language for some reason). Also, probably should only affect mainspace/draft space. Headbomb {
t ·
c ·
p ·
b}03:55, 13 March 2020 (UTC)
Definitely something to explore, though I wonder about the scope; is this a "hundred-edit cleanup" project, or is this a "hundred-edit-per-day cleanup" project? I notice that
WP:WCW tracks (via
#45,
#51,
#53, and
#91) most of this type of issue. Only about 400 hits on the first three. The only one I would be concerned about is #91, because I glanced at a few and it looks like people are trying to use the other-language wikis as references; converting those to {{ill}} might be problematic as they would be harder to track and thus makes it a CONTEXT issue.
Primefac (
talk)
15:26, 15 March 2020 (UTC)
I did some preliminary searches and found some three thousand results for Japanese and Chinese, and a hundred or so for Thai. It does appear that there are false positives in the form of deliberate references, though these should be avoidable by excluding citations. I just realised though that there's another issue which may prevent a bot from doing this: the link might be piped to something else that isn't the appropriate English title, e.g. abbreviations and non-disambiguated forms. For example,
2020 in Philippine television contains the piped link [[:th:ลมซ่อนรัก (ละครโทรทัศน์)|Hidden Love]], which would need to be converted to {{ill|Hidden Love (TV series)|lt=Hidden Love|th|ลมซ่อนรัก (ละครโทรทัศน์)}}. --
Paul_012 (
talk)
18:58, 16 March 2020 (UTC)
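A Python sketch of the conversion discussed above, to make the piping problem concrete. The language-code pattern is a rough assumption, and english_titles is a hypothetical, human-supplied mapping from link label to the intended English title; without it the label is used as the title, which is exactly the case that can be wrong.

```python
import re

# [[:xx:Foreign title|Label]] ; the language-code pattern is a rough assumption.
PIPED_INTERWIKI = re.compile(
    r"\[\[:([a-z]{2,3}(?:-[a-z]+)*):([^\[\]|]+)\|([^\[\]|]+)\]\]"
)


def convert_interwiki_links(wikitext, english_titles=None):
    """Rewrite piped interlanguage links as {{ill}} templates."""
    titles = english_titles or {}

    def repl(m):
        lang, foreign, label = m.group(1), m.group(2), m.group(3)
        english = titles.get(label, label)
        lt = f"|lt={label}" if english != label else ""
        return f"{{{{ill|{english}{lt}|{lang}|{foreign}}}}}"

    return PIPED_INTERWIKI.sub(repl, wikitext)
```

A real run would also need to skip citations, as noted earlier in the thread.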
This can be prone to breaking archive URLs and creating link rot if one is not careful. See
WP:WEBARCHIVES for a list of the archives used on Enwiki and the formats they use. The regex at
Wikipedia:Bots/Requests_for_approval/DemonDays64_Bot_2 is an example; it uses lookbehind to avoid URLs that are embedded in an archive URL, and
User:DemonDays64 could probably help explain it. The other problem is that if you retain the tracking bits in the archive URL but remove them from the source |url=, the two are now mismatched and look like different URLs. Other bots might pick up on that and restore the archive URL version of the source URL, since it is the authority (once the link is dead). Personally, I would bypass any citation that involves an archive URL: too many complications. --
GreenC16:23, 22 February 2020 (UTC)
Tracking bits are evil, they must go away. If web archive used this evil URL in the past, that's something we have to live with; better link rot than supplying Facebook with anything. Grüße vom
Sänger ♫ (
talk)16:27, 22 February 2020 (UTC)
@
Sänger: not challenging the idea (it'd be great to clean up links if there weren't side effects) but think about this: we'd be hurting Facebook by leaving them; it gives them bad data every time someone clicks one that isn't actually in the place it was supposed to be. Still would be a good idea if only the archive bots would reliably understand.
DemonDays64 (
talk)
17:48, 22 February 2020 (UTC)
If it can be done reliably without altering the generated contents, damaging links and causing linkrot, I would applaud any efforts to remove these tracking/click identifiers like
utm_source,
utm_medium,
utm_campaign,
utm_term,
utm_content,
gclid,
gclsrc,
dclid,
fbclid,
zanpid (see
UTM parameters). Similar things could be done for some other known types of links as well, for example
Google Books links as used in many citations often contain all kinds of irrelevant parameters and could, in most cases, be reduced and normalized to just the id parameter identifying the book and some page information. If this causes problems in associating archived links, we should try to work with the archivers so they improve their matching algorithms (given our good relations with them, I guess this already happens, at least in the case of archive.org). We might also think about adding a module to the framework of citation templates containing a ruleset for a number of known sites which would at least highlight links containing unnecessary parameters in edit preview so the parameters can be manually removed by (knowledgeable) editors even before archives are created. --
Matthiaspaul (
talk)
11:48, 17 March 2020 (UTC)
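A minimal Python sketch of stripping the parameters listed above from a single URL. It edits the raw query string so that the remaining parameters are kept byte for byte, and it leaves archive URLs alone, as suggested earlier in the thread. The ARCHIVE_HOSTS tuple is an illustrative assumption, not the full list from WP:WEBARCHIVES.

```python
from urllib.parse import urlsplit, urlunsplit

TRACKING = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "gclid", "gclsrc", "dclid", "fbclid", "zanpid",
}
# Assumption: a tiny illustrative subset of archive hosts.
ARCHIVE_HOSTS = ("web.archive.org", "archive.org")


def strip_tracking(url: str) -> str:
    """Drop known tracking parameters; return archive URLs unchanged."""
    parts = urlsplit(url)
    if parts.netloc.lower() in ARCHIVE_HOSTS:
        return url
    kept = [pair for pair in parts.query.split("&")
            if pair.split("=", 1)[0].lower() not in TRACKING]
    return urlunsplit(parts._replace(query="&".join(kept)))
```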
I have a working bot; its purpose is to give readers and editors alike information regarding the presence of different content in other Wikipedia language editions for the same article they are reading. This information can be used to guide the reader to content which will add to their study, and/or highlight that content in another language happens to be biased. I hope that such a bot would be used to ratchet up the level of discourse across language editions and spread useful knowledge between them.
My proposal is that the bot be allowed to add a small phrase to the 'See Also' section of a given article, such as, "The Russian edition of this article is 70% different from this edition. You can view it
here."
I assert that the bot works: its most limiting factor right now is that I only have access to 2 million characters of translation capability per month for article comparisons, which limits the bot to a relative handful of articles in output per month. You can see the code
here.
This is not a bot request--it is a request for the bot to have edit capabilities. If there is a more appropriate place for this request, please let me know.
I think you need
WP:BRFA, though one of the requirements is that you show you have consensus to perform the edits you want. I don't see that in the Village Pump discussion you've linked to.
Spike 'em (
talk)
16:56, 26 March 2020 (UTC)
Adding reciprocal merge templates
The task is to add a merge template to the other page where only one page has had the merge template added; that is, to add reciprocal tags. This has been proposed before and gained consensus, but it doesn't seem to have been finished, or the scope was expanded so far that it became controversial. Rather than starting from scratch, it might be possible to resurrect
Mutleybot or to add this as a
Merge bot task, something
wbm1058 has suggested before (
Wikipedia:Bot requests/Archive 70#Removing bad merge requests). I suggest that the scope of the bot be simple, and that it not be designed to interpret merge consensus (or not), something that has been controversial in the past.
Klbrain (
talk)
07:50, 10 April 2020 (UTC)
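To show how narrow the proposed scope could be, here is a Python sketch that only finds missing reciprocal tags and makes no judgement about merge consensus. It assumes the tags are {{merge to}} and {{merge from}}, that page text is supplied as a dict of title to wikitext, and that titles match exactly (no normalization of case or underscores). The function name is hypothetical.

```python
import re

MERGE_TAG = re.compile(
    r"\{\{\s*merge (to|from)\s*\|\s*([^|{}]+?)\s*(?:\|[^{}]*)?\}\}",
    re.IGNORECASE,
)


def missing_reciprocal_tags(pages):
    """Return (page_to_edit, tag_to_add) for every one-sided merge tag."""
    todo = []
    for title, text in pages.items():
        for m in MERGE_TAG.finditer(text):
            direction, other = m.group(1).lower(), m.group(2)
            reverse = "from" if direction == "to" else "to"
            other_text = pages.get(other)
            if other_text is None:
                continue  # target page not supplied; nothing to check
            tagged = any(
                r.group(1).lower() == reverse and r.group(2) == title
                for r in MERGE_TAG.finditer(other_text)
            )
            if not tagged:
                todo.append((other, f"{{{{merge {reverse}|{title}}}}}"))
    return todo
```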
Hi. The main resource for sourcing basic biography data on Olympians, Sports Reference, has now been
switched off. I started a recent thread about this at the
Olympic Project. There are tens of thousands of articles that source Sports Ref. However, there's quite a simple fix that can be done to stop the links from going dead: changing "cite web" to "cite sports-reference" in the ref, as per
this example, adds the web archive link. This is per the
recent change made to the cite template by
Zyxw.
So, please can a bot change "cite web" to "cite sports-reference" wherever it is used to cite Sports Reference on WP? Many thousands of articles already use the latter, but even more do not. Please ping me if you need any more info. Thanks. LugnutsFire Walk with Me13:58, 17 May 2020 (UTC)
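A Python sketch of the requested replacement, limited to cite web templates whose parameters point at Sports Reference. The URL fragment used to recognise those citations is an assumption, as is the idea that the remaining parameters can be carried over unchanged; the function name is hypothetical.

```python
import re

# Assumption: Olympic biographies lived under this host and path.
SPORTS_REF_MARKER = "sports-reference.com/olympics"
CITE_WEB = re.compile(r"\{\{\s*cite web(\s*\|[^{}]*)\}\}", re.IGNORECASE)


def convert_sports_ref_cites(wikitext: str) -> str:
    """Rename {{cite web}} to {{cite sports-reference}} for Sports Reference URLs only."""
    def repl(m):
        if SPORTS_REF_MARKER in m.group(1):
            return "{{cite sports-reference" + m.group(1) + "}}"
        return m.group(0)

    return CITE_WEB.sub(repl, wikitext)
```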
A bot that condenses article issue templates into the
‘multiple issues’ template
I think it would be effective to have a bot that condenses multiple “article issue” templates, such as “more citations needed” or “Missing information” into the “This article has multiple issues” so it appears as one notice instead of several consecutive notices. Users might forget to do this, or add to previously existing issue templates and forget to condense it using the ‘multiple issues’ template. I propose that this bot would apply the condensing template to any article with more than 2 notices at the top, or whatever the official guidelines are for this according to the
Manual of Style as I’m not yet sure what they say about the number of templates allowed to appear. This would be fully automated as opposed to the semi-automation of the
AutoWikiBrowser that already has this capability.
This might already exist or have been discussed, so forgive me if I’m wrong.
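A rough Python sketch of the condensing step, assuming the bot works from a known list of issue templates (only the two named above are included) and only looks at templates at the very top of the page. The default threshold follows the "more than 2" wording but is just a parameter; the names are hypothetical.

```python
import re

# Assumption: a real list would be much longer than these two examples.
ISSUE_TEMPLATES = ("more citations needed", "missing information")
LEADING_ISSUE = re.compile(
    r"\s*(\{\{\s*(?:%s)\s*(?:\|[^{}]*)?\}\})"
    % "|".join(re.escape(name) for name in ISSUE_TEMPLATES),
    re.IGNORECASE,
)


def condense_issues(wikitext: str, min_tags: int = 3) -> str:
    """Wrap consecutive leading issue templates in {{multiple issues}}."""
    tags, pos = [], 0
    while True:
        m = LEADING_ISSUE.match(wikitext, pos)
        if m is None:
            break
        tags.append(m.group(1))
        pos = m.end()
    if len(tags) < min_tags:
        return wikitext
    return "{{multiple issues|\n" + "\n".join(tags) + "\n}}" + wikitext[pos:]
```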
@
Kanashimi: Hi, that seems like something that would work. Could you optimize it for use on the English Wikipedia and then submit your request at
Wikipedia:Bots/Requests for approval?
There are many bots whose job involves making regular updates or are otherwise anticipated to make edits frequently. When such bots stop operating, it might just be because they're no longer needed and have been retired, or it might be indicative of a problem. I propose a bot that monitors the edits of other bots known to make frequent edits (those bots could be added to a category, or to one of several categories based on level of activity expected), and sends an automated alert to a noticeboard if the bot makes no edits within the expected timeframe. At the noticeboard, editors could review the alerts, marking some as no issue and placing others into a queue for repairs. (This is somewhat a follow-up to
my brainstorming from March; feel free to lmk if it's just as non-viable, but I wanted to at least throw it out here.) {{u|Sdkb}}talk17:20, 1 May 2020 (UTC)
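The core check in this proposal is small. A Python sketch, assuming each monitored bot has a known last-edit time and an expected interval derived from its activity category; the function name and the data layout are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def overdue_bots(last_edits, expected_interval, now):
    """Return the names of bots whose last edit is older than expected."""
    return sorted(
        name for name, last in last_edits.items()
        if now - last > expected_interval[name]
    )
```

A monitoring bot would then post the returned names to the noticeboard, where editors decide whether the silence is a problem.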
There are bot accounts, bot operators and bot software. We could monitor bot account activity but for any given account there could be a dozen or more bots (software) associated with that single account. It might be possible to differentiate by looking at edit summaries, but the complexities of keeping it up to date would be challenging. --
GreenC17:36, 1 May 2020 (UTC)
Yeah, that is a challenge. I wish that more of a standardized notation was set up so that bots would always identify which task they were doing in the edit summary of each edit they make. As it stands, I think the more realistic goal for now would be to just focus on catching bot accounts that stop editing entirely (many cases where bots stop without being immediately noticed fall into that category anyways, I think?). That would improve on the status quo, and if the system works, maybe others in the future would feel compelled to expand it to include individual tasks. {{u|Sdkb}}talk20:03, 1 May 2020 (UTC)
@
Pppery: I wouldn't say that's always the case. Take the example above — people cared enough to start updating it manually, they just didn't know it was supposed to be done by a bot, or didn't know to take it here, or maybe assumed someone else was working on the issue. {{u|Sdkb}}talk23:04, 1 May 2020 (UTC)
In theory, bots can/should define a unique useragent. These can be queried via meta=featureusage, although doing so is tedious and would require knowing the agent for each bot. I'm skeptical of the utility of this in general, but in theory such a tool could check bots without edits, and those known to make no or minimal edits, that way. In practice,
User:Joe's Null Bot/source does not list a custom useragent, so it'll be MediaWiki::API/0.41 or whatever version it's using. Trivial to add. ~ Amory(
u •
t •
c)15:14, 2 May 2020 (UTC)
Throwing ideas, we could simply have a table of bots sortable by bot name, operator name, number of edits made, and by date of last edit. Then have a disclaimer at the top that several bots, like nullbots, will not make edits. Would give a good idea at a glance of which bot is active and which isn't. Headbomb {
t ·
c ·
p ·
b}14:56, 2 May 2020 (UTC)
Seems reasonable. The only question would be what to do with the inactive bots and/or blocked bots; do they share the same table or do they eventually get pulled and/or lose their bot status?
Primefac (
talk)
15:03, 2 May 2020 (UTC)
Primefac: It used to be that inactive bots would be deflagged after notifying the operator, but I don't think anyone on either side has done that chore for a while. I wonder if we should flag bots for 3 years and have the operators request extensions. –
xenotalk23:20, 2 May 2020 (UTC)
BOTPOL already says we can pull the flag after 2 years (and 1 week notice); a table like this would make the task a little easier, since it would mean not having to manually update it
like some people do. I do think we can be a little less blasé about handing out the bot flag - if a bot is doing a one-time run, there's not much point in granting it indefinitely. However, that's something to discuss at
WP:BOTN.
Primefac (
talk)
14:58, 3 May 2020 (UTC)
@
Majavah: not a super fan of the machine-format time stamps. Any way to format them in a friendlier-to-human way, e.g. 2020-05-02T08:14:47Z → 2020-05-02 [08:14:47] (UTC) or whatever? Headbomb {
t ·
c ·
p ·
b}17:42, 2 May 2020 (UTC)
@
Headbomb: As I said, that's still a work-in-progress. I'm not sure what to do with them as I would also like to keep them automatically sortable which gives its own problems when trying to make them human-readable (must use order year-month-day... and can't use month names instead of numbers). –Majavah(
t/
c)17:45, 2 May 2020 (UTC)
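One way around the sortable-versus-readable tension is to keep the machine timestamp as the cell's sort key and display something friendlier. A Python sketch, assuming the table is a sortable wikitable that honours data-sort-value; the function name and the displayed format are illustrative choices.

```python
from datetime import datetime


def timestamp_cell(iso: str) -> str:
    """Build a wikitable cell that sorts by ISO timestamp but shows a readable date."""
    dt = datetime.strptime(iso, "%Y-%m-%dT%H:%M:%SZ")
    return f'| data-sort-value="{iso}" | {dt.day} {dt:%B %Y}'
```

With the sort key carried separately, month names in the visible text no longer break sorting.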
How about now? The date format is still open, I'm 50-50 split between your format above and the one currently there ("02 May 2020 18:04:45 (UTC)"). Also added the edit count there. –Majavah(
t/
c)18:08, 2 May 2020 (UTC)
Not super fussy on the exact format. I'd put edit counts in the second column personally. And use <center>—</center> instead of hyphens to indicate an inexistent entry. Headbomb {
t ·
c ·
p ·
b}18:14, 2 May 2020 (UTC)
And maybe move (UTC) to the headers, instead of repeating it in each entry to save width/space. Headbomb {
t ·
c ·
p ·
b}18:16, 2 May 2020 (UTC)
Please don't recommend <center>. I have no opinion on centering text, but an obsolete HTML element isn't cool. --
Izno (
talk)
18:28, 2 May 2020 (UTC)
Whatever gets the job done. I use < center > </ center> because it's the simplest way to reliably center something that doesn't require convoluted mark up. Headbomb {
t ·
c ·
p ·
b}18:32, 2 May 2020 (UTC)
In
this version, I don't really see a need for all three columns indicating activity, and total edits doesn't seem necessary for a report on activity. Non breaking spaces are overkill (but even if they weren't, you should prefer class="nowrap" on the cells in question). I do not think we need to know HH:MM:SS to know a bot has failed or stopped operating. I'm tempted to suggest a human-readable "time since last operation", probably also to 'day' precision. --
Izno (
talk)
18:40, 2 May 2020 (UTC)
Well this isn't just to provide the minimum required information to know if yes/no a bot is active, but also to provide a general overview of the activity of bots. A bot with 3 million edits that is inactive is more interesting than a bot with 240 edits which is inactive. Agree that the exact time-of-day of those activities is probably overkill, but at the same time, there's room in the table. Headbomb {
t ·
c ·
p ·
b}18:46, 2 May 2020 (UTC)
There's room in the table for people with big monitors or skins which aren't responsive. :^). A bot with 3 million edits is no more interesting, though may be more important, than one with 240 edits. I suspect there is a more interesting metric than count of actions/edits to indicate the important ones though...? --
Izno (
talk)
19:00, 2 May 2020 (UTC)
@
Majavah: I made some tweaks
[6]. The class="center" thing messes with column widths, so I went with <center> </center> tags. The final ' of diffs should be done with {{'}} (or you could just make use of {{'}} everywhere instead of '). But this table looks pretty good to me. Headbomb {
t ·
c ·
p ·
b}01:12, 5 May 2020 (UTC)
I'm glad to see this went through — thanks
Majavah and everyone else! To follow up from
Primefac's question above, it does look like there are quite a few bots that haven't made an edit in a long time. It'd probably be good to mark those as retired or remove them from the list. Once that cleanup is done, it might be possible to add some light color coding so that e.g. the "last activity" timestamp cell of any bot inactive for more than say a month is turned red. {{u|Sdkb}}talk16:40, 6 May 2020 (UTC)
This task was previously handled by Acebot (BRFA
here), but it stopped functioning in November 2019. The manual updates done by several editors since then indicate that there is continued demand for the information in the table. Its operator appears to have retired. {{u|Sdkb}}talk17:05, 1 May 2020 (UTC)
Acebot was running the bot on around 50-60 wiki language sites. If they applied for and received bot perms on all those sites, that is a lot of time and work. First step would be to find out why Acebot stopped running and try to get it restarted with Acebot's established perms. --
GreenC17:48, 1 May 2020 (UTC)
There is another bot on ruwiki that generates similar data
here. Ideally data would be stored on
Commons in Tabular format then templates pull it from there, so bots don't have to run on each language wiki. --
GreenC18:10, 1 May 2020 (UTC)
Doing... - an opportunity to test out tabular data on Commons with a Lua template. If it works, the Lua module can be rolled out to other wiki languages without needing bot perms or bot edits. --
GreenC18:24, 1 May 2020 (UTC)
I've been processing the files in that category, and many of the files on Commons are copyright violations, which are deleted within hours/days of upload. It would be useful for a bot to review the files tagged with
Template:Shadows Commons and remove that template if there is no longer a file on Commons with the same name.
At any given time there are only a small number of files in that category, 30 or so, so this could potentially be done more than once a day without being very resource intensive, though once a day would be plenty useful.
The Squirrel Conspiracy (
talk)
06:43, 21 March 2020 (UTC)
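The decision logic for this task can be sketched in a few lines of Python. The existence check is passed in as a callable so the sketch stays independent of any particular bot framework; exists_on_commons and the function name are hypothetical.

```python
import re

SHADOWS_TAG = re.compile(
    r"\{\{\s*Shadows Commons\s*(\|[^{}]*)?\}\}\n?", re.IGNORECASE
)


def untag_if_unshadowed(title, wikitext, exists_on_commons):
    """Remove {{Shadows Commons}} when no Commons file shares the title."""
    if exists_on_commons(title):
        return wikitext
    return SHADOWS_TAG.sub("", wikitext)
```

With around 30 tagged files at a time, a daily run would need only about 30 existence checks.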
Nice. Pywikibot is nifty. My only thought: have the stdout statements carry a date/time stamp and go to a log file, in case you want to track activity. --
GreenC17:16, 3 April 2020 (UTC)
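A minimal sketch of that suggestion using Python's standard logging module in place of print statements. The logger name and the log format are assumptions.

```python
import logging


def make_logger(path: str) -> logging.Logger:
    """Return a logger that writes timestamped lines to the given file."""
    logger = logging.getLogger("bot")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Each call such as log.info("Removed tag from File:Example.jpg") then lands in the file with its date and time prepended.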
The list of vital articles gets updated on a regular basis. Sometimes page titles are changed. I think we should have a bot update the pages to make work easier for humans.
Interstellarity (
talk)
13:14, 2 June 2020 (UTC)
I will let cewbot update the list when updating the section counts and article assessment icons of vital articles. --
Kanashimi (
talk)
08:11, 3 June 2020 (UTC)
Five year old mass message to Wikiprojects not being archived
I noticed that a "Wiki Loves Pride" mass message to all wikiprojects from June 2015 is not being auto-archived by some of those projects that set up autoarchiving (such as
WT:IRAN). It is missing a timestamp. Can a bot be set up to archive all those messages that still remain on the main talk pages to the proper archives? Or can a bot be set up to add timestamps (June 2015) to all the messages that remain on the main talk pages? The message is 5 years out of date, and it seems odd to still be asking people to do something 5 years after it ended.
Yes, I regret not time-stamping these messages, but lesson learned. I archive these manually as I come across them. If a list of WikiProject talk pages with unarchived WLP posts were available, I'd help with archiving, but I don't know how to best help with this otherwise. ---
Another Believer(
Talk)21:43, 28 May 2020 (UTC)
Possible There's
currently an issue with the Data Services replicas from WMF Labs, but once that's sorted, I can hopefully identify a list of the pages that this template is on with some fairly simple SQL. Once I've got that list, should just be a case of running an AWB run to wrap the relevant bits in {{archive top}} and {{archive bottom}} templates - I think stuff being unsigned inside one of those doesn't cause problems for the archiving bots (although feel free to correct me on that!)
Naypta ☺ |
✉ talk page |
22:17, 28 May 2020 (UTC)
I'm quite sure archiving bots need a timestamp somewhere inside the section. {{archive top}}/{{archive bottom}} don't indicate when the section was last edited so the archiving bots won't know that the mass message needs to be archived.
Galobtter (
pingó mió)
22:26, 28 May 2020 (UTC)
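To find the affected sections, a Python sketch that lists level-2 sections of a talk page containing no signature-style timestamp. The timestamp pattern follows the format used on this page; the function name is hypothetical, and the sketch assumes the page's wikitext is already in hand.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
TIMESTAMP = re.compile(r"\d{2}:\d{2}, \d{1,2} (?:%s) \d{4} \(UTC\)" % MONTHS)
# Level-2 headings only: "== Title ==", not "=== Title ===".
HEADING = re.compile(r"^==(?!=)[ \t]*(.*?)[ \t]*==[ \t]*$", re.MULTILINE)


def sections_without_timestamp(wikitext):
    """Return headings of level-2 sections that contain no timestamp."""
    headings = list(HEADING.finditer(wikitext))
    result = []
    for i, m in enumerate(headings):
        end = headings[i + 1].start() if i + 1 < len(headings) else len(wikitext)
        if not TIMESTAMP.search(wikitext[m.end():end]):
            result.append(m.group(1))
    return result
```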
@
Galobtter: Ah, I didn't remember it needed the specified bit in the bot config. Fair enough - something like (timestamp may not be accurate) {{subst:Unsigned|Another Believer|15:13, 3 June 2015 (UTC)}} on the end of each of them then?
Naypta ☺ |
✉ talk page |
22:34, 28 May 2020 (UTC)
@
Izno: From a brief look, it doesn't look like it's just WikiProject talk pages - it's also user talk pages too at least, possibly more. Now there is of course the question of whether or not user talk pages should just be left as they are if the user hasn't bothered to clean it in all that time... but nonetheless.
Naypta ☺ |
✉ talk page |
22:22, 28 May 2020 (UTC)
The IP specified WikiProject talk pages and accordingly both searches were solely in Wikipedia talk namespace. --
Izno (
talk)
22:25, 28 May 2020 (UTC)
Doing... - and I've just realised the edit summaries AWB is leaving include the usernames of the two users responsible - to both of you, I am very sorry for the ping explosion! Helpfully, running AWB in Wine seems to hide the checkbox to turn that off, so I'm going to switch to JWB for the rest.
Naypta ☺ |
✉ talk page |
22:55, 28 May 2020 (UTC)
Done -- cc 65.94.170.207. With apologies, as mentioned above, for the temporary ping-bomb to Another Believer - fortunately it looks like the other user had a rename so they won't have got all the pings. All sorted now, things should be archiving soon :)
Naypta ☺ |
✉ talk page |
23:15, 28 May 2020 (UTC)
Naypta, Thanks for your help! Where I did receive pings, I've been helping to archive my old posts as well as other outdated notifications on talk pages. I'm happy I could help a bit with this cleanup since the mistake was mine. Glad this is finally being resolved, much appreciated! ---
Another Believer(
Talk)01:26, 29 May 2020 (UTC)
Fixing broken section anchors
Wikipedia has a
very long list of section anchors that need to be repaired. To fix these broken links, it is necessary to add an {{anchor}} whenever a section's title is changed.
User:Dexbot was designed to correct these broken links, but it hasn't corrected any of them in several years. Can Dexbot be configured to fix these links again?
Jarble (
talk)
16:40, 22 May 2020 (UTC)
This defunct bot removed inappropriate uses of {{Current}} (which per its documentation is meant only for short-term use on articles receiving a high edit count) by removing it from articles that have not been edited in more than two hours. It stopped functioning I think in 2013, and since then (perhaps because it stopped) the standards have gotten increasingly lax. I propose that we bring it back (with perhaps a slightly longer edit window, at least to start). {{u|Sdkb}}talk08:54, 19 May 2020 (UTC)
I am a bureaucrat on Real Life Villains Wikia and the other bureaucrat wanted to do a category cleanup, but changed his mind about some of the categories. Unfortunately, the user he tasked with removing the categories took his job too seriously and removed them anyway even after we decided to keep them. On any page where User:Super Poison Ivy removed the categories Anti-Semitic, Anti-Christian, Bully, Islamophobes, Fascist, and Communist, I want those categories to be restored. — Preceding
unsigned comment added by
Bjanderson94 (
talk •
contribs)
This is the request page for bots running on English Wikipedia. Wikia is not affiliated with English Wikipedia, so you're in the wrong place and likely won't get anyone willing to help. However, what you're looking to do could be accomplished by AutoWikiBrowser, which you can download and run yourself.
Here is a page on Fandom about running it on non-Wikipedia projects. That page also links to
Wikipedia:AutoWikiBrowser, where you can find documentation.
The Squirrel Conspiracy (
talk)
20:13, 14 April 2020 (UTC)
Create WT: redirects according to WP: shortcuts
Would it be controversial to request a bot to create redirects in the Wikipedia talk namespace to the talk pages of the targets of redirects in the Wikipedia namespace? I've typed WT:xxx, expecting it's a shortcut given WP:xxx is, only to be disappointed it doesn't exist from time to time.
Nardog (
talk)
01:26, 24 February 2020 (UTC)
Not really. It'd create a lot of pointless ones for mostly unused redirects, but it's not like anyone will care. Should only cover those explicitly marked as {{R from shortcut}} though. Headbomb {
t ·
c ·
p ·
b}01:49, 24 February 2020 (UTC)
Should only... Why? It's not like WP: shortcuts technically exist in the main namespace, as in H:. I'd like e.g.
WT:Actors to work, even though WP:Actors isn't marked as a shortcut. (I can see an argument for avoiding shortcuts to sections, though.)
Nardog (
talk)
03:23, 24 February 2020 (UTC)
I think it would be controversial, actually. For example, a lot of WP: shortcuts are to sections or anchors within a page, and it would seem unnecessary to create corresponding WT: shortcuts for these. In addition, the talk page of a redirect is the place to discuss that redirect—it should not automatically be made a redirect to the talk page of the redirect's target. I would definitely oppose using a bot to mass-create these redirects. --
Black Falcon(
talk)21:34, 19 April 2020 (UTC)
Clean up translation template quotemarks
The {{literal translation}} and {{langnf}} templates were originally written with simple unquoted outputs of (if called with just "example text" as their argument) Spanish for example text and lit. example text. This isn't the best way to present a translated string, and in hundreds of articles users have very reasonably added quotemarks into the template calls (eg. {{langnf||Spanish|"Rich Port"}} and {{lit.|"Free Associated State of Puerto Rico"}} on the
Puerto Rico article).
Last week
User:Ravenpuff updated the two templates to include apostrophes around the translated phrase. This resulted in some articles displaying nested quotation marks, such as:-
Puerto Rico (Spanish for '"Rich Port"'; abbreviated PR), officially the Commonwealth of Puerto Rico (Spanish: Estado Libre Asociado de Puerto Rico, lit. '"Free Associated State of Puerto Rico"')
I suggested adding a {{trim quotes}} to the templates to avoid this, and Ravenpuff suggested fixing all of the hundreds or thousands of template calls in articles instead. Which sounds like a job for a bot, so here I am. A bot would simply be tasked with checking all usages of the {{literal translation}} and {{langnf}} templates (and their synonyms), to look for any argument that starts and ends with a quotation mark, and remove those marks. If an argument contained more than two quotemarks, which is perhaps plausible where an editor offers multiple translations, that should be flagged somehow.
I do not recommend putting {{trim quotes}} into a template unless you know for sure what the parameter values will be. There are too many edge cases, like the one you have already thought of. –
Jonesey95 (
talk)
14:43, 16 April 2020 (UTC)
What other options are there? Is it possible to have a template that applies the logic of "if the passed string is not already wrapped in quotemarks, wrap it in quotemarks", without being computationally expensive? Or should this be the job of a bot? --
Lord Belbury (
talk)
10:21, 22 April 2020 (UTC)
There are too many kinds of quote marks, in too many arrangements, to be sure that automated trimming would render the string correctly. You could probably use string processing of the contents to put pages in a hidden category for human inspection. –
Jonesey95 (
talk)
13:29, 22 April 2020 (UTC)
So would it work to have the template apply a string process of:
If the string starts and ends with a quotemark (or starts and ends with an apostrophe), display it unchanged
If the string contains any quotemarks at all, or any apostrophes that aren't in the middle of a word, display it unchanged and add a hidden category to flag that the template call is providing something more complex than a single literal translation
Otherwise (ie. if the string contains no quotemarks, and its apostrophes are all in the middles of words), display it surrounded by additional quotemarks
I'm not sure how I'd write that in a template, but can take a look if that seems like it wouldn't be computationally expensive. --
Lord Belbury (
talk)
14:04, 22 April 2020 (UTC)
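The three rules proposed above can be sketched outside template syntax to check the logic on test strings. This is a minimal sketch in Python rather than wikitext or Lua; the function name `render_translation` and the particular sets of quote characters are assumptions, and the second return value stands in for "add a hidden tracking category".

```python
import re

QUOTES = "\"“”„«»"      # straight and curly double quotes (assumed set)
APOSTROPHES = "'‘’"     # straight and curly apostrophes (assumed set)

# An apostrophe counts as "mid-word" when it has a word character on both sides.
MID_WORD_APOSTROPHE = re.compile(r"(?<=\w)['‘’](?=\w)")


def render_translation(text):
    """Return (display_text, needs_review) following the three rules."""
    s = text.strip()
    if len(s) >= 2 and (
        (s[0] in QUOTES and s[-1] in QUOTES)
        or (s[0] in APOSTROPHES and s[-1] in APOSTROPHES)
    ):
        return s, False                  # rule 1: already quoted, unchanged
    remainder = MID_WORD_APOSTROPHE.sub("", s)
    if any(ch in QUOTES or ch in APOSTROPHES for ch in remainder):
        return s, True                   # rule 2: unchanged, flag for a human
    return "'" + s + "'", False          # rule 3: wrap in quotemarks
```

Note that rule 1, as stated, lets a string such as "rich port" or "wealthy port" through unflagged because it starts and ends with a quotemark; catching that case would need an extra count of the quotemarks.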
That sounds like a fun experiment. Let me know if you set up a testcases page, and I'll come visit. Make sure to account for straight quotes and curly quotes. –
Jonesey95 (
talk)
15:06, 22 April 2020 (UTC)
@
Jonesey95: Best I can do offhand has been
User:Lord Belbury/sandbox, which needs another pass to ignore all italics markup, and I'm stumped by Lua's handling of curly quotes (I've never used Lua before): it's beyond me why a match of s:match([[([“])]]) is returning true for a single curly apostrophe. Will take another look later, would appreciate any feedback (or a pointer to a better talk page to ask for templating help).
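One plausible explanation for the curly-quote result, assuming the sandbox used Lua's standard string.match rather than Scribunto's Unicode-aware mw.ustring.match: the standard string library works on bytes, so a character class containing “ is really a set of the three bytes in its UTF-8 encoding, and a curly apostrophe shares two of them. The Python sketch below only illustrates the byte overlap; `shared_bytes` is a hypothetical helper.

```python
left_double_quote = "“".encode("utf-8")   # U+201C
curly_apostrophe = "’".encode("utf-8")    # U+2019


def shared_bytes(a, b):
    """Bytes that appear in both UTF-8 encodings, in ascending order."""
    return sorted(set(a) & set(b))


print(left_double_quote.hex(), curly_apostrophe.hex())
# prints: e2809c e28099
print([hex(b) for b in shared_bytes(left_double_quote, curly_apostrophe)])
# prints: ['0x80', '0xe2']
```

A byte-wise class built from “ would therefore match the first byte of ’ as well, which is consistent with the behaviour described.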
As the result of a move request,
2019–20 coronavirus pandemic was moved to
COVID-19 pandemic. Unfortunately, there are a metric tonne of articles (and templates and categories) that have "2019–20 coronavirus pandemic" (or "2020 coronavirus pandemic") in the name. Accordingly, it would be appreciated if we could get a bot that would move all of these to the consistent "COVID-19 pandemic" name. This matter was briefly discussed in the move request, with unanimous support for consistency, and it's quite obvious that all these titles should be in line with the main article, named so only because of the previous name.
While this is a one-time request, I believe this is too time-consuming with AWB as these are title changes. But happy to be told otherwise. -- tariqabjotu03:14, 4 May 2020 (UTC)
As a note, it seems like enough people are attempting to do this manually that this may not be necessary as a bot. But, I'll leave this up anyway. -- tariqabjotu03:31, 4 May 2020 (UTC)
I'd like to see this done through a bot, just so we don't miss any and save ourselves some work. I started some discussion about general implementation
here. {{u|Sdkb}}talk04:40, 4 May 2020 (UTC)
Care should also be taken to ensure all talk page archives (or any other subpages if they exist) are moved.
Nil Einne (
talk)
06:17, 4 May 2020 (UTC)
Following up from
this conversation, I think it would be helpful to have a bot automatically apply the appropriate padlock icon to pages after they become protected. {{u|Sdkb}}talk09:39, 21 May 2020 (UTC)
Could I also mention that it would be useful to have a bot which fixes incorrect protection templates?
MusikBot removes incorrect ones, as I pointed out
here, but it doesn't replace them (and could be the cause of some of these missing templates). This seems like a related subject.
RandomCanadian (
talk /
contribs)
18:29, 21 May 2020 (UTC)
MusikBot is capable of fixing incorrect protection templates, that feature just didn't get through the
BRFA. I am willing to give it another push though, if there's demand for it. Similarly it can apply missing protection templates, I just didn't enable that since there was talk to revive the
Lowercase sigmabot task that did this and we didn't want the bots to clash. When that didn't happen, TheMagickBOT came through, but alas it has retired now too. I don't mind one way or the other, so if Naypta wants to take it on don't let me stop you, just know that the code is largely written in MusikBot. — MusikAnimaltalk18:49, 21 May 2020 (UTC)
@
MusikAnimal: If the code's already written in MusikBot, seems to me to make a whole lot more sense to just push to use MusikBot for it then if there's consensus to do this now - the lazier I can be, the better!
Naypta ☺ |
✉ talk page |
18:52, 21 May 2020 (UTC)
It definitely makes sense to have one bot do all the work regarding protection templates rather than a hodge podge of different bots.
Galobtter (
pingó mió)
18:45, 23 May 2020 (UTC)
Yeah, I've been doing a fair amount of batch protects now (which are easier using p-batch or manually using the mediawiki interface in some cases.) I'm not mass adding the ECP template though, as that seems like a waste of my time for nice but optional templates. Anyway, I thought this was still happening via another bot, so add a +1 to bring some bot back to do it (cc:
MusikAnimal if you're still willing to give this a go)
TonyBallioni (
talk)
18:57, 7 June 2020 (UTC)
Should the bot add templates to fully-protected pages, too? The bot will be exclusion compliant, so if admins for whatever reason didn't want to advertise full protection they can use the {{bots}} template to keep MusikBot out (
MusikBot II to be precise, since it already has admin rights).
Should it do this for "move" and "autoreview" (pending changes) actions in addition to "edit"?
Cyberbot II used to handle PC-protected pages but it appears that task has been disabled.
I'm going to hold off on fixing existing protection templates for the time being, just to keep it simple. We'll get to that with a follow-up BRFA. — MusikAnimaltalk21:06, 9 June 2020 (UTC)
MusikAnimal, adding it for full-protected pages sounds fine to me. I'm not sure about move and autoreview. Thanks for working on this! {{u|Sdkb}}talk21:31, 9 June 2020 (UTC)
There is already an existing score parameter that will determine if a team wins or loses a match. The w/l parameter is deemed dubious and redundant, so the score parameter should be used to determine the win-loss logic instead.
Scenarios for bot actions according to existing parameter values
Scenario description
Sample parameter usage
Requested bot action
Both w/l and score parameters are empty
|w/l= |score=
Remove w/l parameter usage
w/l value is empty, and score value is dash (or en dash, hyphen)
|w/l= |score=- (using minus sign)
|w/l= |score=– (using en dash)
w/l is either W or L, and score contains dash-separated numbers
With over 7k transclusions, it sounds like this would fall under
User:PrimeBOT/Task 30. Won't have time to start it until probably next week, but that has the benefit of allowing for discussion here about any issues with the above, and/or implementation.
Primefac (
talk)
12:48, 11 June 2020 (UTC)
Primefac, I added a new case for when score uses the HTML entity &ndash; instead of "–" (for example, |score=90&ndash;100). Sorry for the late notice. I only noticed it today when PrimeBOT made some edits, as this case was not covered in the initial 25 amendments the other day. –
McVahl (
talk)
06:27, 21 June 2020 (UTC)
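As a rough sketch of how a bot might recognise the separator variants mentioned in this thread (hyphen, en dash, and an HTML entity, assumed here to be &ndash;), the score value could be parsed with one pattern. The name `parse_score` is hypothetical and this is not PrimeBOT's actual code.

```python
import re

# Two numbers separated by a hyphen, an en dash, or the &ndash; entity.
SCORE_RE = re.compile(r"^\s*(\d+)\s*(?:-|–|&ndash;)\s*(\d+)\s*$")


def parse_score(value):
    """Return the two numbers as a tuple, or None if the value is not a score."""
    match = SCORE_RE.match(value)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Empty values and a bare dash return None, which corresponds to the scenarios where there is no numeric score to work from.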
Populate tracking category for CS1|2 cite templates missing "}}"
Example. Missing "}}" is a not uncommon problem. They can't be tracked by CS1|2 itself because the template is never invoked. I would caution against attempting an automated fix because when "}}" doesn't exist there are often other structural problems, and there might be embedded templates etc. --
GreenC15:29, 18 June 2020 (UTC)
Another problem with automated fixes is that edits that result in unclosed templates often need to be reverted entirely. –
Jonesey95 (
talk)
16:03, 18 June 2020 (UTC)
@
GreenC: Where would you envisage this putting categorisation markup? Tracking categories are a
MediaWiki internal feature that would have to be added by a MediaWiki extension, not a bot. A bot could add a hidden category to the wikitext of the page, it could add articles with issues to a list, or it could tag articles with a maintenance template, like {{Invalid citation template}} perhaps - which could then categorise the page.
Naypta ☺ |
✉ talk page |
16:19, 18 June 2020 (UTC)
That is a good point. I think your idea for {{Invalid citation template}} (or universal {{Invalid template}} or {{
malformed template}}) is great because it could be visible in the wikitext, produce a red warning message, allow for a tracking cat, and have argument options for the bot name and date, plus whatever future requirements. --
GreenC17:05, 18 June 2020 (UTC)
The problem here is going to be that, because there's no invocation of the template, it's tricky to find an appropriate set of pages to check over, without doing some god-awful regex search and crashing Elasticsearch in the process.
One way to do it might be to implement an edit filter that finds matching regexes and tags them, but I'm not sure if that'd necessarily be the best way. Any thoughts, ideas or suggestions are appreciated!
Naypta ☺ |
✉ talk page |
09:16, 19 June 2020 (UTC)
For other bots, I have a system on Toolforge that downloads a list of all 6 million article titles then goes through the list; when done, it recreates the list and starts over. It's brute force but effective and not as terrible as it sounds when running on Toolforge since the Wikipedia servers are on the local LAN. Another possibility is to generate a backlink list for the CS1|2 templates and only target those, which would reduce it down to a few million. I have a unix command-line tool that does both of these (generate the full list, or backlink list) if you want to use it, on git. --
GreenC13:48, 19 June 2020 (UTC)
@
GreenC: Going through a backlink list would be easy enough through the API, and there's already a list of all article titles in the DB dumps that are stored on drives accessible through Toolforge anyhow. My concern had been that doing that a) introduces quite a fair bit of server load, and b) seems like it would take about five hundred years to complete - have you found it works better?
Naypta ☺ |
✉ talk page |
14:30, 19 June 2020 (UTC)
Generally speaking in a shared environment like this slowing things down is the nicer way as it doesn't cause a spike in demand. Downloading every article sequentially would be like a 15-30k steady stream which is a blip on a gigabit LAN. And CPU/memory to regex a single article is nothing. It's about as low as one can get resource wise, while running a SQL query across 6 million can cause a resource spike but it's hidden from view. My guess is 10-15 days to complete 6 million articles based on previous experience. I have processes doing this continually so do other bots. If there was a way to regex the target articles with Elasticsearch could try that but I suspect ES will bail on the query if too complex (it limits 10,000 results but should not be a problem here). --
GreenC15:11, 19 June 2020 (UTC)
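The per-article check discussed in this thread could look something like the sketch below: for each citation template opening, count braces forward and report the opening if it never balances. This is a heuristic sketch only; the function name and the pattern are assumptions, it also matches non-CS1 templates whose names start with "cite" or "citation", and an unclosed citation nested inside another template may be reported against the outer template instead.

```python
import re

# Opening of a {{cite ...}} or {{citation ...}} call (assumed naming).
CITE_START = re.compile(r"\{\{\s*(?:cite|citation)\b", re.IGNORECASE)


def unclosed_cite_offsets(wikitext):
    """Return offsets of citation template openings that are never closed."""
    unclosed = []
    for match in CITE_START.finditer(wikitext):
        depth = 0
        i = match.start()
        closed = False
        while i < len(wikitext) - 1:
            pair = wikitext[i:i + 2]
            if pair == "{{":
                depth += 1
                i += 2
            elif pair == "}}":
                depth -= 1
                i += 2
                if depth == 0:
                    closed = True
                    break
            else:
                i += 1
        if not closed:
            unclosed.append(match.start())
    return unclosed
```

A bot could use a non-empty result to tag the page for human inspection rather than attempt a fix, in line with the cautions raised above.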
About 106 pages in article and template space contain wikilinks that begin with w:en:, which is redundant, and
VPT consensus was that this extra code can interfere with various tools and scripts that expect links to be in a certain form. Would it be possible for an AWB-wielding editor to go through and remove those prefixes, at least in article space? The edits in template space would need manual inspection to see if they are intentional for some reason. Pinging @
Redrose64,
Trialpears,
Xaosflux,
Johnuniq, and
BrownHairedGirl:, who attended that VPT discussion. –
Jonesey95 (
talk)
03:55, 1 May 2020 (UTC)
I have completed a first pass, in these 87 edits
[10].
In that run I turned off genfixes, so that I could focus clearly on this precise issue. Some of the links fixed were of the form [[w:en:Foo|Bar]], which has now been changed to [[Foo|Bar]]. That's fine ... however, many of the links were of the form [[w:en:Foo|Foo]], and that first run has left them as [[Foo|Foo]], which needs to be consolidated as [[Foo]]. So I will do a second run through the set, just applying genfixes. --
BrownHairedGirl(talk) • (
contribs)
05:08, 1 May 2020 (UTC)
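The two passes described above can be combined in a single substitution. This is a minimal sketch, not AWB's find-and-replace or genfixes logic; the function name is hypothetical, and it deliberately handles only piped links, since removing the prefix from an unpiped link would change the displayed text.

```python
import re

# Piped links of the form [[w:en:Target|Label]].
LINK_RE = re.compile(r"\[\[w:en:([^\[\]|]+)\|([^\[\]]*)\]\]")


def strip_enwiki_prefix(wikitext):
    """Drop the redundant w:en: prefix and collapse [[Foo|Foo]] to [[Foo]]."""
    def repl(match):
        target, label = match.group(1), match.group(2)
        if label == target:
            return "[[" + target + "]]"
        return "[[" + target + "|" + label + "]]"
    return LINK_RE.sub(repl, wikitext)
```

For example, `strip_enwiki_prefix("[[w:en:Foo|Bar]] and [[w:en:Foo|Foo]]")` gives `[[Foo|Bar]] and [[Foo]]`.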
Second pass complete, in these 70 edits
[11]. A further 17 pages needed no genfixes at all, so were skipped. Note that some of the pages had only genfixes unrelated to the first pass.
Thanks, all! I did not search other namespaces initially, but there are apparently 100,000+ instances across all namespaces. Many are in user signatures and other things that should not be modified, but detail-oriented editors may find links worth changing in some namespaces. –
Jonesey95 (
talk)
13:27, 1 May 2020 (UTC)
@
Headbomb: Before doing this, would it be reasonable to ask if the template source could be tweaked to display the right info even when the parameter is incorrect?
GoingBatty (
talk)
04:12, 23 April 2020 (UTC)
A bot or script taking on this task would somehow have to account for the edge case where a single page number contains a valid hyphen, like p=3-1, for a document where page 1 of part 3 is called "3-1". –
Jonesey95 (
talk)
05:00, 23 April 2020 (UTC)
Those have IMO, acceptable false positives rates (after all this type of stuff is part of AWB genfixes, and no one is calling for heads to roll), and that's why the standard is to explicitly set |page=3{{hyphen}}1 in those cases in CS1/CS2 templates. But if that's somehow not an acceptable solution here, the bot could take care of the rest. Or assume that |p=p. 3-4 should be converted to |p=3-4 and not |pp=3–4. Headbomb {
t ·
c ·
p ·
b}15:21, 2 May 2020 (UTC)
Updating DANFS links to ship articles
While working on a stub recently, I noticed the US Navy's Naval History and Heritage Command has updated the syntax of links to entries in the important reference
Dictionary of American Naval Fighting Ships. This means that many outside links to the dictionary and tools like
Template:DANFS (which is transcluded on hundreds if not thousands of US Navy ship articles) now have incorrect html targets. Here are three examples of repairs I've performed personally:
[12],
[13],
[14]. As those examples reveal, the new webpage structure isn't complicated and while I suppose I could go through all the articles by hand and rapidly improve my edit count, this is exactly the sort of thing that an automated performer of edits would be best to solve. I've never before requested a bot, so I'm asking meekly for advice.
BusterD (
talk)
15:54, 2 May 2020 (UTC)
Thanks for the speedy reply. As I mouseover the links I created in my request, I see
the site is now secure "https" not "http"
after the page address www.history.navy.mil/ they've added a new location for the entire collection "research/histories/ship-histories/"
the new addresses all end in .html not .htm
In addition, they've changed the reference structure so that the page link no longer directs to a sub-page, for example in the USS Minnesota example, the old link referenced the 11th "m" page, rendered as "m11". The new link just uses "m".
The first three are simply direct replacement edits (copy and paste), the fourth one requires the deletion of ANY digit or digits directly following the only letter in that sector of the address. Does that make sense? I'm certain my use of terminology is inexpert.
BusterD (
talk)
16:15, 2 May 2020 (UTC)
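The four changes listed above can be expressed as one rewrite. This is a sketch under stated assumptions: the old URL shape (a "danfs" folder, then a letter folder with optional digits, then a page ending in .htm) is assumed for illustration, and the function name is hypothetical. Any URL that does not fit that shape is returned unchanged so it can be checked by hand.

```python
import re

# Assumed old shape: http://www.history.navy.mil/danfs/m11/some-page.htm
OLD_URL = re.compile(
    r"^https?://www\.history\.navy\.mil/danfs/"
    r"(?P<letter>[a-z])\d*/"          # "m11" keeps only the letter "m"
    r"(?P<page>[^/]+)\.html?$"        # page name without .htm or .html
)
NEW_PREFIX = "research/histories/ship-histories/"  # new location of the collection


def update_danfs_url(url):
    """Rewrite an old-style DANFS URL; leave anything else untouched."""
    match = OLD_URL.match(url)
    if match is None:
        return url
    return "https://www.history.navy.mil/%sdanfs/%s/%s.html" % (
        NEW_PREFIX, match.group("letter"), match.group("page")
    )
```

Testing the output links on a handful of articles before any wider run would show whether the assumed pattern holds.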
I've converted your list to numbers just for my own ease of use. I've got a few other projects I'm working on, but I'll take a look if and when I can.
Thanks for the refactoring. I suspected the number of entries must be large. Testing the success of the first few attempts would be a simple matter. Thanks for any help you can offer. Perhaps there are some MilHist or WPShips people who'd do this, but as opposed to starting a talkpage discussion, I just thought I'd request automated help. I'd be glad to monitor or help in any way necessary. Coding isn't my thing.
BusterD (
talk)
16:26, 2 May 2020 (UTC)
The major part of the change isn't recent - see
[15] and the discussion there was that the changes weren't completely consistent so couldn't easily be done automatically.
Nigel Ish (
talk)
20:41, 2 May 2020 (UTC)
Many of my criticisms in that old discussion stand but a few of the historical documents have come back. Some of those are not in the original report format preserving all context but in new transcribed form. Meanwhile the Vandals (Homeland Security with no interest in "Service history"?) sacked the USCG Historian's site with the old cutter histories "burned" and instead of a good index a pile of "stuff" one has to click through in hopes of finding what was once well organized. (Fingers crossed Army holds the anti Vandal defense line!) One has to realize providing excellent historical libraries for the public (that paid for everything) is not high on the mission priority or budget list and contracting out has eliminated subject matter expert librarians from intimate involvement and oversight regarding on line collections.
Palmeira (
talk)
15:04, 3 May 2020 (UTC)
Checks if article in
Category:Album_infoboxes_lacking_a_cover is not about a single (as many singles don't have album artwork, so only looking at EPs/albums/mixtapes only would streamline
Using
Last.fm's album.getinfo request, and obtains the "small" artwork to abide by the size guidelines regarding uploading album artwork to Wikipedia.
Uploading said cover to the wiki, and editing it into the Album infobox
I looked in the rejected ideas and bots, and it seems like none really tried to attack this. My programming knowledge is okay at best, but I couldn't get any of the Java frameworks working so I'm out of luck doing this myself. ⠀TOMÁSTOMÁSTOMÁS⠀
TALK⠀00:49, 22 June 2020 (UTC)
Strongest possible oppose to any automation that adds non-free content to the project.
WP:NFCC criteria 8 has to be decided on a case-by-case basis, and while there are some that believe that there is an inherent justification for using a non-free image in the infobox of a media work, that is not what the NFCC says. No offense to the proposer themselves, but this is a dangerous idea that goes against
the third pillar, and would set a dangerous precedent if allowed.
The Squirrel Conspiracy (
talk)
00:00, 23 June 2020 (UTC)
@
The Squirrel Conspiracy: Thanks for the response. I understand, but one question just for my clarification more than anything. Wouldn't criteria 8 be applicable to any specific album page though? Wouldn't the addition of artwork in album articles "significantly increase readers' understanding of the article topic"? I can't think of a case where album artwork doesn't do that (unless it's a soundtrack for a film or TV show). As well, to @
Redrose64:'s point, since the inherent format of album articles gives consistency and thus consistency in reasons to use Non-Free Content, wouldn't boilerplated text be applicable, since what is generally true for one similarly formatted article would carry over to the others? Again, don't mean to be contrarian here or anything, but I just want to genuinely better familiarize myself with the policy. ⠀TOMÁSTOMÁSTOMÁS⠀
TALK⠀15:28, 23 June 2020 (UTC)
I am the wrong person to ask that question to. I'm of the opinion that unless the album cover itself is the subject of non-trivial coverage in cited prose, it doesn't meet NFCC #8, and that the community's consensus that creative works get a piece of non-free content in the infobox for free for identification purposes is a terrible mistake.
The Squirrel Conspiracy (
talk)
18:37, 23 June 2020 (UTC)
Oppose. WP:NFCCP#10c requires ... a separate, specific non-free use rationale for each use of the item, as explained at
Wikipedia:Non-free use rationale guideline. The rationale is presented in clear, plain language and is relevant to each use. This is not possible for a bot to do except by means of boilerplated text, and that would imply that little or no thought has been put into the wording of the FUR. --
Redrose64 🌹 (
talk)
11:21, 23 June 2020 (UTC)
Not a good task for a bot, for the repeated reasons listed above. Yes, a bot could provide links to an editor to facilitate the process of processing a NFCC import/justification, but a user script could as well.
Hasteur (
talk)
17:44, 23 June 2020 (UTC)
wp:SQLREQ COVID 19 data compiler
Could a bot run a SQL query or similar to compile COVID 19 data into an editable data sheet that another/same bot could import to Wikipedia COVID 19 pandemic update map/graph? — Preceding
unsigned comment added by
80.41.138.48 (
talk)
16:24, 19 June 2020 (UTC)
Hi, what database would the bot be querying, and can you elaborate on what you had in mind by an 'editable data sheet'? Also, who would edit this sheet prior to the bot importing it into the map/graph?
Pi(Talk to me!)06:03, 20 June 2020 (UTC)
Challenge bot
I hope that someone can help me by making a bot that adds the template noting that an article has been added to
Wikipedia:WikiProject Europe/The 10,000 Challenge, for example. There are several Challenge pages, and there are templates to be added to the articles' talk pages noting that the articles have been added to the Challenge project page, but the bot stopped doing the task a long time ago. Please ping me if this can be done.
BabbaQ (
talk)
17:17, 27 April 2020 (UTC)
Coding... I did a quick sample/proof of concept of going in and reviewing pages for eligibility, here's a random sampling of pages that appear to be eligible. Adding the template to the talk page is easy compared to unwinding the list.
@
BabbaQ: Was there a consensus discussion about applying {{WPEUR10k}} to these talk pages? I suspect this isn't controversial, but it might be needed when I go to file the BRFA.
Hasteur (
talk)
19:45, 7 June 2020 (UTC)
@
Hasteur: - I did the request based on this being uncontroversial. A few years back the template was added to all new articles joining the projects. And I was surprised to notice that was not done anymore.
BabbaQ (
talk)
08:11, 8 June 2020 (UTC)
I'm a little backlogged at the moment, but will try to get the worldometers dataset working asap. The first link uses the same dataset that WugBot does, and an interim solution would be to write a
Lua module that reads the on-wiki CSV files and writes a wikitable. —
Wug·a·po·des05:12, 8 April 2020 (UTC)
@
Wugapodes: I did a little work on this myself, and found that there’s an additional complication: the GitHub dataset updates only daily, while the actual interactive website updates every few hours. I tried fooling around with some web-scrapers that support JavaScript, but ran into a lot of problems, probably due to my very small amount of programming experience. Perhaps you can find a working solution?
Sam1370 (
talk)
10:04, 8 April 2020 (UTC)
Updating more than once a day is likely unnecessary. The source data for each administrative unit doesn't really update more than once a day anyway, the website just shows the data as it comes in and the GitHub export combines it into a batch update. --
AntiCompositeNumber (
talk)
23:15, 8 April 2020 (UTC)
However, it is likely that even if this bot is implemented which updates it once per day, there are still going to be people who, in the interest of providing the most up-to-date information, will manually edit in the correct numbers, bringing us back to where we started. I think that we should try to keep the info as accurate and recent as possible. I have contacted JHU by email about this subject, asking them to either make the GitHub update along with the site or provide an easy way for a bot to get the most up to date data, but have received no response so far.
Sam1370 (
talk)
06:32, 9 April 2020 (UTC)
Any potential problems caused by manual changes may be resolved by the bot building the page instead of amending it, just as
Legobot (
talk·contribs) does with the RfC listings. For example, go to
WP:RFC/BIO and alter it in any way you like - move the requests around, delete some, add others. Then wait for the next bot run (1 min past the hour) and see what happens. --
Redrose64 🌹 (
talk)
07:58, 9 April 2020 (UTC)
However, do we really want to sacrifice accuracy for automation? Personally I would rather have manual, but the most accurate, case readings instead of automated, but slightly inaccurate, readings. As for the bot building the page, that just seems weird to me — removing helpful edits in favor of outdated data? I think we should either find a way to deliver the information right along with the JHU site, or leave it to be updated manually.
Sam1370 (
talk)
09:35, 9 April 2020 (UTC)
Oh come on, the JHU data isn't freely licensed and they're actively claiming copyright over it (which has no basis in US law). Copying it to Commons would not be a great idea in that case, unless the Commons community has decided to ignore their claims. --
AntiCompositeNumber (
talk)
23:30, 8 April 2020 (UTC)
@
Izno: I've been in touch with WMF Legal regarding this specific bot task and the response from
Jrogers (WMF) was "I don't see any reason for the Foundation to remove these templates or any of the map pages linked from them". Johns Hopkins can claim copyright only on the specific presentation and selection of the data, not the data itself (which is public domain) per
Feist v. Rural: "Notwithstanding a valid copyright, a subsequent compiler remains free to use the facts contained in another's publication to aid in preparing a competing work, so long as the competing work does not feature the same selection and arrangement". The data on the wiki have a different presentation and selection of data and therefore represent a valid use of the public domain component of the Johns Hopkins dataset, so I see no need to stop the bot task nor does WMF's senior legal counsel see a reason to remove its output. —
Wug·a·po·des 03:14, 9 April 2020 (UTC)
The data's not acceptable on Commons because Commons cares about source country and US copyright. However, enwiki only cares about US copyright law, which doesn't recognize any copyrightable authorship in data like this. --
AntiCompositeNumber (
talk)
13:28, 9 April 2020 (UTC)
Heather Houser (May 5, 2020).
"The Covid-19 'Infowhelm'". The New York Review of Books. Covid-19 is undoubtedly testing our public health, medical, and economic systems. But it's also testing our ability to process so much frightening and imminently consequential data. All these data add up to the Covid-19 "infowhelm," the term I use to describe the phenomenon of being overwhelmed by a constant flow of sometimes conflicting information. --
GreenC 16:56, 6 May 2020 (UTC)
WikiProject United States files on Commons
There are thousands of file talk pages in
Category:File-Class United States articles for files that were moved to Commons and deleted in 2011 or 2012. These talk pages contain no content except a transclusion of {{WikiProject United States}} (or one of its redirects) and should have been deleted long ago. These transclusions are of no use to the WikiProject and should be removed; however, simply removing them would leave these talk pages blank and mislead a viewer seeing a blue link into thinking there is something there. More broadly, there is no reason for these Commons files to be project-tagged on en.wikipedia—local talk pages for Commons files generally lead to split discussions or invite occasional comments that no one sees or answers.
I don't think it's this clear cut. Even if the files are on Commons, they do appear and are used on enWikipedia and I can see reasons to tag them for a WikiProject. The misplaced comments are an actual problem but I don't think their occurrence has any correlation with the presence of a WikiProject template.
Jo-Jo Eumerus (
talk)
08:23, 20 April 2020 (UTC)
Jo-Jo Eumerus, you may be right in general, but this WikiProject does not have such reasons. Certainly, the fact that
CSD G8 exempts talk pages for files that exist on Commons suggests it would be wrong to assume that no WikiProject could tag files on Commons (although that is my preference). However, I am not looking to take on that broader issue right now, and my focus is just on WikiProject United States, which does not need these pages to be tagged. --
Black Falcon (
talk) 16:32, 26 April 2020 (UTC)
I think I'm going to go with Needs
wider discussion. WP:PROJSCOPE is pretty clear that if the members of a wikiproject agree that a page is outside of their scope, it should not be tagged. However, anything related to mass deletion requires strong consensus to implement, which I do not see here. Under
WP:G8, simply being a talk page for a Commons file is not a sufficient reason to delete, and this task isn't clearly covered by the text of G6. According to
this query, there are over 11,000 file talk pages with only one revision and a {{WikiProject United States}} tag. (I unfortunately can't reliably filter for "has more than one template" without doing wikitext parsing. However, most of the WP:USA file tags appear to have been added in single-project AWB runs, so the total number is likely to be fairly close. Any bot that would implement this task would need to parse the wikitext to ensure that the WP:USA tag is the only page content.) Bot tagging 11,000 pages for deletion is also not exactly polite, so this task would be best implemented by an adminbot that can just do the deletions (which again, requires demonstrated strong community approval). --
AntiCompositeNumber (
talk)
16:25, 22 April 2020 (UTC)
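For illustration only, the "parse the wikitext to ensure the tag is the only page content" check mentioned above could look roughly like the sketch below. It is a minimal sketch, not a tested bot: the `BANNER_NAMES` list is hypothetical (the real redirects to {{WikiProject United States}} would have to be looked up), case-insensitive name matching is a simplification, and any page whose banner contains nested templates is deliberately treated as "not only the banner" so that a human looks at it.

```python
import re

# Hypothetical list: the template's real redirects would need to be added here.
BANNER_NAMES = ["WikiProject United States"]


def is_only_banner(wikitext, names=BANNER_NAMES):
    """Return True if the page is exactly one banner template and nothing else."""
    # Allow spaces or underscores between the words of each template name.
    name_alt = "|".join(
        "[ _]+".join(re.escape(part) for part in name.split()) for name in names
    )
    # One template call, optional parameters without nested braces,
    # surrounded only by whitespace.
    pattern = r"\s*\{\{\s*(?:" + name_alt + r")\s*(?:\|[^{}]*)?\}\}\s*"
    return re.fullmatch(pattern, wikitext, flags=re.IGNORECASE) is not None
```

A page that fails this check (a second banner, a stray comment, a nested template) would simply be left alone.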
AntiCompositeNumber, thank you for responding. The challenge is that there are two issues which are intermingled here: (1) removing {{WikiProject United States}} from talk pages of files on Commons; and (2) mass-deleting the resulting empty talk pages. (1) is the WikiProject's decision and does not require a wider discussion. I was hoping that (2) would be uncontroversial housekeeping (
CSD G6), but am willing to seek a wider discussion if that is not the case. --
Black Falcon (
talk) 16:32, 26 April 2020 (UTC)
@
AntiCompositeNumber: Making an edit with the sole purpose of bringing the page within the scope of a speedy deletion criterion is not acceptable behaviour for a human or a bot. You will need explicit consensus that these pages should be deleted.
Thryduulf (
talk)
09:57, 19 May 2020 (UTC)
Thryduulf, that
misses the point. The edits would not be for the sole purpose of speedily deleting the pages; instead, they would be for the purpose of removing an unneeded project banner. Deletion would be incidental to the pages becoming blank, and I am just trying to save time by skipping an intermediate step. However, in light of the hesitation expressed above, I will seek a wider discussion related to this request. --
Black Falcon (
talk) 16:35, 25 May 2020 (UTC)
Referencias
Very easy typo task (no proper rights to do it myself): == Referencias == -> == References == and ==Referencias== -> == References == --
Emptywords (
talk)
09:02, 20 July 2020 (UTC)
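As a rough illustration of the replacement requested above, a sketch in Python follows. It assumes only level-2 headings should be touched and that the heading sits on its own line; the function name `fix_heading` is made up for this example.

```python
import re

# Level-2 "Referencias" heading, with or without spaces inside the equals signs.
HEADING = re.compile(r"^==[ \t]*Referencias[ \t]*==[ \t]*$", flags=re.MULTILINE)


def fix_heading(wikitext):
    """Replace a Spanish 'Referencias' level-2 heading with '== References =='."""
    return HEADING.sub("== References ==", wikitext)
```

Level-3 headings such as `=== Referencias ===` are left untouched by this pattern, which keeps the change conservative.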
I checked this category, which is for so-called valid SVG files tagged with {{Valid SVG}}; however, I noticed that many were in fact invalid. I would like a bot to check all files in the category to see if they are in fact valid or if the files are mistagged. Steps:
Check if file is valid at http://validator.w3.org/check?uri=http:{{urlencode:{{filepath:{{#titleparts:{{PAGENAME}}}}}}}}, if yes ignore, if no see 2.
@
JJMC89: I don't think using another validator should be a problem, as long as they both output the same number of errors. If that's not the case, I think the valid/invalid SVG templates should be updated with the new validator as well.
Jonteemil (
talk)
20:56, 29 May 2020 (UTC)
@
JJMC89 and
Jonteemil: Both validators shared the same number of warnings/errors for a few files I put through them, which makes sense, because, well, they're following the same spec to validate off. That being said, whilst
there is a nice, easy API to use for the nu validator, it's still possible to use the old validator just by parsing the HTML it outputs - although that'd be slower to run and a bit more of a pain.
Naypta ☺ |
✉ talk page |
21:06, 29 May 2020 (UTC)
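To make the "easy API" point concrete, here is a hedged sketch of how a bot might interpret a JSON result from the nu validator. It assumes the response is a JSON object with a `messages` list whose entries carry a `type` of `error` or `info`, with warnings reported as `info` plus a `subType` of `warning`; that shape should be confirmed against the checker's own documentation before relying on it. No network request is made here, and the function names are invented for the example.

```python
import json


def count_messages(result):
    """Count (errors, warnings) in a checker result dict with a 'messages' list."""
    errors = warnings = 0
    for msg in result.get("messages", []):
        if msg.get("type") == "error":
            errors += 1
        elif msg.get("type") == "info" and msg.get("subType") == "warning":
            warnings += 1
    return errors, warnings


def is_valid(result):
    """Treat a file as valid when the checker reported no errors."""
    errors, _ = count_messages(result)
    return errors == 0


# Made-up sample response, for illustration only.
sample = json.loads(
    '{"messages": [{"type": "error", "message": "x"},'
    ' {"type": "info", "subType": "warning", "message": "y"}]}'
)
```

A file for which `is_valid` returns False would be a candidate for retagging; whether warnings should count is a decision for the request, not something this sketch settles.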
Can someone create a bot that will look at the latest date of the maps when the maps are updated and update the date automatically? I tried putting in the TODAY template, but I got reverted by
Boing! said Zebedee that it would not work. I was hoping someone could work on a bot to save editors' time updating the dates on the maps.
Interstellarity (
talk)
19:45, 26 May 2020 (UTC)
No, the "as of" dates should be updated only when the actual data is updated, not any time the map file is updated (which could be for many reasons). Do we update the "as of" if someone adjusts the colour of a map? No. Do we update it if someone modifies a geographical border? No. We would only do it when a map is updated to reflect new data - and I can't think of how that could be done other than manually. Incidentally, I reverted your use of TODAY as it's obviously wrong for every map to say it's up to date as of today.
Boing! said Zebedee (
talk)
19:52, 26 May 2020 (UTC)
This seems like a doable task. I'm not sure if it's for a bot so much as a template, though. I imagine that it would work similarly to {{
Cases in the COVID-19 pandemic|date}}, fetching a value that would be stored at the Commons file and updated by the map updater whenever they upload a new version. As an aside, thank you, Interstellarity, for all the work you've put in updating map date captions; I recognize it's a tedious task. {{u|Sdkb}} talk 19:59, 26 May 2020 (UTC)
Some kind of template like that might work, but whoever updates the map would still have to update the data field at the Commons file manually - it couldn't just use the upload date as the "as of" date.
Boing! said Zebedee (
talk)
20:03, 26 May 2020 (UTC)
While we're here, if anyone wants to work on a bot for updating maps themselves, that's something that ought to be done at some point, but I imagine it'll be a much more complex task. Still, we have the data stored in templates already, so it'd just need to be mapped onto the various maps. It could help with some of the
standardization we've been discussing at the WikiProject. {{u|Sdkb}} talk 20:05, 26 May 2020 (UTC)
@
Sdkb and
Raphaël Dunant: It looks like you two might be talking about different things - either that or I'm misunderstanding one or both of you. Raphaël, it sounds like what you want is basically just an AWB run for pages that contain the map to replace the associated date when appropriate; Sdkb, it sounds like what you're after is software that constructs the actual map. Both of these things are eminently possible; the world map SVG is such that making a bot to update the colours from a dataset given a scale ought to be trivial. That being said, if that bot was wanting to update the Commons file, it would need to be a Commons bot, not a bot on enwiki. Let me know if I've got what you're both looking for wrong though!
Naypta ☺ |
✉ talk page |
16:06, 28 May 2020 (UTC)
This bot request is about updating dates automatically, not about the map colour. But it would be nice to adapt the map bot for COVID-19 map, @
Sdkb: if you can open a discussion about this subject, I'll happily participate. @
Naypta: if you could explain how to automatically update dates, I'll be delighted!
Raphaël Dunant (
talk)
16:42, 28 May 2020 (UTC)
@
Raphaël Dunant: So one option here might be using the {{Wikidata}} template to pull in a record from Wikidata. That would mean that you could replace each iteration of the date with {{wikidata|qualifier|Q81068910|P1846|P585}}, which produces the stored date - and you'd then only have to update the single point-in-time qualifier on Wikidata (
wikidata:Q81068910#P1846) for it to update on all wikis. I've had a chat with a couple of admins about this and the general consensus is that it's okay to do performance-wise, but be careful with how you use this - using the wikidata template in this way can be taxing on the server, so try and use it as few times as you can! If you're happy with that method, I can run through and update the relevant bits on enwiki - you'll know better than I will where the bits are on the other wikis.
Naypta ☺ |
✉ talk page |
18:26, 28 May 2020 (UTC)
@
Raphaël Dunant: Well, this would be a way of doing it without a bot. The {{wd}} template pulls directly from Wikidata, so there's no need for a bot to update the page wikitext then. Assuming that the other wikis also have a similar template for Wikidata, which I think most do, they'd be able to use the same code. I will just ping in here the creator of the template,
Thayts - do you think it'd be okay to use this method on high traffic pages in this way? The general consensus I've had seems to be "yes", but I've not spoken to anyone directly involved in the Wd module.
Naypta ☺ |
✉ talk page |
13:52, 30 May 2020 (UTC)
Awesome!
Raphaël Dunant, if you're happy with this solution, I can get it working on the relevant pages on enwiki at least. I can also have a crack at the other language wikis -
it's clear that the template is available on the other wikis too, so this kind of a centralised approach should work. The only problem might come in terms of needing to purge the page caches when the Wikidata item changes - but that should happen when any part of the page changes anyway, and can be done manually if need be.
Naypta ☺ |
✉ talk page |
16:34, 31 May 2020 (UTC)
@
Naypta: Thank you very much for the solution! I applied it to the English Wikipedia pages. It would be amazing if you can apply it
here and on other Wikipedias, as I am not quite sure on how to apply the template to Commons and other languages. Thanks again, I hope this solution works.
Raphaël Dunant (
talk)
19:00, 31 May 2020 (UTC)
@
Raphaël Dunant: Doing... - just FYI, to make it compatible for inclusion on Commons and on some other language Wikipedias, I've changed the Wikidata page it links into. It's now
wikidata:Q95963597 - so when updating the date, update it on the P585 "point in time" property there, and it'll update everywhere else automatically.
Naypta ☺ |
✉ talk page |
20:02, 31 May 2020 (UTC)
@
Naypta: The solution works well for most pages, thanks! However, it does not automatically update
this page, which is problematic (maybe because the date is updated only when there is a page update?). Do you have any solution to make it update this page as well?
Raphaël Dunant (
talk)
22:01, 1 June 2020 (UTC)
@
Raphaël Dunant: Sure thing. So the cache expires on the sooner of the next edit, a manual purge being requested, or seven days from the last cache time. I've manually purged the cache of that page, and you can see it's now updated, but
you can also purge it at this link whenever you like. You may wish to do so after updating the Wikidata item - just click that link and then click "yes", it'll automatically update the date :)
Naypta ☺ |
✉ talk page |
22:16, 1 June 2020 (UTC)
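For anyone who would rather script the purge than click the link, a minimal sketch follows. It builds a POST request for the MediaWiki `action=purge` API module using only the Python standard library; the API URL, the User-Agent string and the function name are assumptions for illustration, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; a page on Commons or another wiki needs that wiki's api.php.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_purge_request(titles, api_url=API_URL):
    """Build (but do not send) a POST request that purges the given page titles."""
    data = urlencode(
        {
            "action": "purge",
            "titles": "|".join(titles),  # multiple titles are pipe-separated
            "forcelinkupdate": "1",
            "format": "json",
        }
    ).encode("utf-8")
    # Placeholder User-Agent; a real script should identify its operator.
    headers = {"User-Agent": "PurgeSketch/0.1 (hypothetical example)"}
    return Request(api_url, data=data, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` after updating the Wikidata item would have roughly the same effect as clicking the purge link by hand.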
Convert comma separated values into List
Comma separated values like A, B, C can be instead converted into
A
B
C
or
{{hlist|A|B|C}}
This is usually found in infoboxes. Additionally, values separated by a
@
Naypta: I always try to turn the data carried by the infoboxes into a more structured form (I know it'll never get there completely). It would make it easier to export data from infoboxes into
WikiData. I'mFeistyIncognito 20:01, 16 June 2020 (UTC)
This is a context-sensitive task. To give just one example, {{hlist}}, because it uses <div>...</div> tags, cannot be wrapped by any tags or templates that use <span>...</span> tags, like {{nowrap}}. If an infobox wraps a parameter with {{nowrap}}, converting that parameter's contents to use {{hlist}} will lead to invalid HTML output. –
Jonesey95 (
talk)
22:17, 14 June 2020 (UTC)
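As a sketch of the mechanical part of this request only, the conversion could be written as below. It is deliberately conservative: any value that already contains wiki markup is returned unchanged, and it does nothing about the context problem raised above (whether the infobox wraps the parameter in {{nowrap}} or similar), which a real task would have to check per infobox. The function name is made up for the example.

```python
def to_hlist(value):
    """Convert 'A, B, C' to '{{hlist|A|B|C}}'; leave anything ambiguous unchanged."""
    # Values with links, templates, HTML or '=' need human judgement.
    if any(ch in value for ch in "{}[]|<>="):
        return value
    items = [item.strip() for item in value.split(",")]
    items = [item for item in items if item]
    if len(items) < 2:
        return value  # nothing to turn into a list
    return "{{hlist|" + "|".join(items) + "}}"
```

For example, `to_hlist("A, B, C")` gives `{{hlist|A|B|C}}`, while a value such as `[[Paris, Texas]], B` is left alone because the comma inside the link is not a separator.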
Compile a list of which Wikiproject covers what article
Send each user[1] and each WikiProject a personalized report about which articles they created have errors in them, e.g.
= = List of your created articles that are in [[:Category:Harv and Sfn no-target errors]] = =
A few articles you created are in need of some reference cleanup. Basically, some short references created via {{tl|sfn}} and {{tl|harvnb}} and similar templates have missing full citations or have some other problems. This is ''usually'' caused by copy-pasting a short reference from another article without adding the full reference, or because a full reference is not making use of citation templates like {{tl|cite book}} (see [[Help:CS1]]) or {{tl|citation}} (see [[Help:CS2]]). See [[Category:Harv and Sfn template errors#Resolving errors|how to resolve issues]]. To easily see which citation is in need of cleanup, you can check '''[[:Category:Harv and Sfn template errors#Displaying error messages|these instructions]]''' to enable error messages ('''Svick's script''' is the simplest to use, but '''Trappist the monk's script''' is a bit more refined if you're interested in doing deeper cleanup).
The following articles could use some of your attention
{{columns-list|colwidth=30em|
#[[Ancient 1]]
#[[Article 2]]
...
}}
If you could add the full references to those articles, that would be great. Again, the easiest way to deal with those is to install Svick's script per [[:Category:Harv and Sfn template errors#Displaying error messages|these instructions]]. If after installing the script, you do not see an error, that means it was either taken care of, or was a false positive, and you don't need to do anything else.
Also note that the use of {{para|ref|harv}} is no longer needed to generate anchors. ~~~~
^Skip user talk pages with links to List of your created articles that are in [[:Category:Harv and Sfn no-target errors]] in headers since they already have such a report
I think the message needs to provide a link to a discussion page where people can go for help. Keep in mind that most requests for help will be of the form "What is this message? I didn't do anything or ask for this. I don't understand it. Help me." –
Jonesey95 (
talk)
23:31, 18 May 2020 (UTC)
Maybe
Template talk:Sfn? If this goes through, I'd like to see these messages go out in batches, in case a potential help system (run by you and me, presumably) gets a lot of traffic. –
Jonesey95 (
talk)
01:44, 19 May 2020 (UTC)
Doesn't really matter much to me where things go.
Module talk:Footnotes could be a place. Messages could be sent in batches too. Maybe top 25 users, then next 25, and so on each day for the first week. And then see what the traffic is and adjust rates if it's nothing crazy. Headbomb {
t ·
c ·
p ·
b}11:28, 19 May 2020 (UTC)
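The batching idea above (top 25 users first, then the next 25, and so on) is simple to sketch. The dictionary of per-user article counts and the function name are hypothetical; how the counts are gathered is outside this example.

```python
def daily_batches(article_counts, batch_size=25):
    """Split users into batches, users with the most affected articles first."""
    ranked = sorted(article_counts, key=article_counts.get, reverse=True)
    return [ranked[i:i + batch_size] for i in range(0, len(ranked), batch_size)]
```

One batch would be sent per day, and the batch size could be raised or lowered once the help-page traffic is known.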
I have added both categories to the WikiProject Cleanup Listings. They will appear in the next run on June 23. --
Bamyers99 (
talk)
01:24, 17 June 2020 (UTC)
I want a bot to do all of my editing. It is hard to do editing. It may help with deleting pages if you want to. Having a bot also puts less stress on editing.
Was an explorer —Preceding
undated comment added 14:21, 4 August 2020
I don't think you really understand what bots are used for. They can assist with editing for tedious and/or repetitive tasks, but they won't "do your editing".
Primefac (
talk)
15:48, 4 August 2020 (UTC)
Copy coordinates from lists to articles
Virtually every one of the 3000-ish places listed in the 132 sub-lists of
National Register of Historic Places listings in Virginia has an article, and with very few exceptions, both lists and articles have coordinates for every place, but the source database has lots of errors, so I've gone through all the lists and manually corrected the coords. As a result, the lists are a lot more accurate, but because I haven't had time to fix the articles, tons of them (probably over 2000) now have coordinates that differ between article and list. For example, the article about the
John Miley Maphis House says that its location is 38°50′20″N 78°35′55″W (38.83889°N 78.59861°W), but the manually corrected coords on the list are 38°50′21″N 78°35′52″W (38.83917°N 78.59778°W). Like most of the affected places, the Maphis House has coords that differ only a small bit, but (1) ideally there should be no difference at all, and (2) some places have big differences, and either we should fix everything, or we'll have to have a rather pointless discussion of which errors are too little to fix.
Therefore, I'm looking for someone to write a bot to copy coords from each place's NRHP list to the coordinates section of {{infobox NRHP}} in each place's article. A few points to consider:
Some places span county lines (e.g. bridges over border streams), and in many of these cases, each list has separate coordinates to ensure that the marked location is in that list's county. For an extreme example,
Skyline Drive, a long scenic road, is in eight counties, and all eight lists have different coordinates. The bot should ignore anything on the duplicates list; this is included in citation #4 of
National Register of Historic Places listings in Virginia, but I can supply a raw list to save you the effort of distilling a list of sites to ignore.
Some places have no coordinates in either the list or the article (mostly archaeological sites for which location information is restricted), and the bot should ignore those articles.
Some places have coordinates only in the list or only in the article's {{Infobox NRHP}} (for a variety of reasons), but not in both. Instead of replacing information with blanks or blanks with information, the bot should log these articles for human review.
Some places might not have {{infobox NRHP}}, or in some cases (e.g.
Newport News Middle Ground Light) it's embedded in another infobox, and the other infobox has the coordinates. If {{infobox NRHP}} is missing, the bot should log these articles for human review, while embedded-and-coordinates-elsewhere is covered by the previous bullet.
I don't know if this is the case in Virginia, but in some states we have a few pages that cover more than one NRHP-listed place (e.g.
Zaleski Mound Group in Ohio, which covers three articles); if the bot produced a list of all the pages it edits, a human could go through the list, find any entries with multiple appearances, and check them for fixes.
Finally, if a list entry has no article at all, don't bother logging it. We can use WP:NRHPPROGRESS to find what lists have redlinked entries.
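The rules in the points above can be summarised as a small decision function. This is only a sketch of the logic as described, with invented names and return values; fetching the coordinates from the lists and infoboxes is not shown, and where the points leave the order of checks open (for example, a missing infobox on a place with no coordinates anywhere), the order below is an assumption.

```python
def decide(title, article_exists, has_infobox, list_coords, article_coords, duplicates):
    """Return 'skip', 'log' or 'copy' for one list entry, following the points above."""
    if not article_exists:
        return "skip"  # redlinked entries are not logged
    if title in duplicates:
        return "skip"  # multi-county places keep per-list coordinates
    if list_coords is None and article_coords is None:
        return "skip"  # e.g. location-restricted archaeological sites
    if not has_infobox:
        return "log"   # {{infobox NRHP}} missing: human review
    if list_coords is None or article_coords is None:
        return "log"   # coordinates on one side only: human review
    if list_coords == article_coords:
        return "skip"  # already in sync
    return "copy"      # copy the list's coordinates into the infobox
```

Logging every page for which "copy" is returned would also give the list needed to spot pages that appear more than once.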
I've copied this request from an archive three years ago; an off-topic discussion happened, but no bot operators offered any opinions. Neither then nor now has any discussion been conducted for this idea; it's just something I've thought of. I've come here basically just to see if someone's willing to try this route, and if someone says "I think I can help", I'll start the discussion at
WT:NRHP and be able to say that someone's happy to help us. Of course, I wouldn't ask you actually to do any coding or other work until after consensus is reached at WT:NRHP.
Nyttend (
talk)
15:53, 12 February 2020 (UTC)
You could use {{Template parameter value}} to pull the coordinate values out of the {{NRHP row}} template. It would still likely take a bot to do the swap but it would mean less updating in the future. Of course, if the values are 100% accurate on the lists then I suppose it wouldn't be necessary.
Primefac (
talk)
16:55, 12 February 2020 (UTC)
Never heard of that template before. It sounds like an Excel =whatever function, e.g. in cell L4 you type =B4 so that L4 displays whatever's in B4; is that right? If so, I don't think it would be useful unless it were immediately followed by whatever's analogous to Excel's "Paste Values". Is that what you mean by having a bot doing the swap? Since there are 3000+ entries, I'm sure there are a few errors somewhere, but I trust they're over 99% accurate.
Nyttend (
talk)
02:57, 13 February 2020 (UTC)
That's a reasonable analogy, actually. Check out the source of
Normani#Awards_and_nominations: it pulls the wins and nominations values from the infobox at the "
list of awards", which means the main article doesn't need to be updated every time the list is changed.
As far as what the bot would do, it would take one value of {{coord}} and replace it with a call to {{Template parameter value}}, pointing in the direction of the "more accurate" data. If the data is changed in the future, it would mean not having to update both pages.
Now, if the data you've compiled is (more or less) accurate and of the not-likely-to-change variety (I guess I wouldn't expect a monument to move locations) then this is a silly suggestion – since there wouldn't be a need for automatic syncing – and we might as well just have a bot do some copy/pasting.
Primefac (
talk)
21:27, 14 February 2020 (UTC)
Primefac, thank you for the explanation. The idea sounds wonderful for situations like the list of awards, but yes these are rather accurate and unlikely to change (imagine someone picking up
File:Berry Hill near Orange.jpg and moving it off site), so the bot copy/paste job is probably best.
Nyttend (
talk)
02:23, 15 February 2020 (UTC)
Thank you for helping me understand. "as it currently stands" Is there something wrong with it, i.e. if changes were made you'd be offering, or do you simply mean that you have other interests (
WP:VOLUNTEER) and don't feel like getting involved in this one? This question might sound like I'm being petty; I'm writing with a smile and not trying to complain at all.
Nyttend (
talk)
00:27, 21 February 2020 (UTC)
Actually not. A not-so-small fraction of articles need to have different coordinates in lists and infoboxes, as I already noted here. If we consistently rely on the lists to inform Wikidata, it's going to end up with a good number of self-contradictions due to lists that appropriately don't provide coordinates that make sense in articles (e.g. multi-county listings). Moreover, you can't rely on the infoboxes to inform Wikidata, because there's a consistently unacceptable error rate in coordinates unchecked by humans, and very few infoboxes are checked by humans; they're derived from the National Register database, and it would be pointless to ignore or trash the human-corrected Virginia coordinates. Literally all that needs to be done is a bot doing some copy/pasting; it would greatly be appreciated if someone were to spend a few minutes on this, instead of passing the buck.
Nyttend backup (
talk)
19:36, 28 April 2020 (UTC)
A bot to develop a mass of short stubs and poorly built articles for Brazilian municipalities
I propose that a bot along the lines of {{Brazil municipality}} be created to develop our stubs like
Jacaré dos Homens which have been lying around for up to 14 years in some cases. There are 5,570 municipality articles, mostly poorly developed or inconsistent with data and formatting even within different states. A bot would bring much needed information and consistency to the articles and leave them in a half decent state for the time being.
Igaci which
Aymatth2 expanded is an example of what is planned and would happen to stubs like Jacaré dos Homens. Some municipalities have infoboxes and some information but hopefully this bot will iron out the current inconsistencies and dramatically improve the average article quality. It would be far too tedious to do it manually, would take years, and they've already been like this for up to 14 years! So support on this would be appreciated.†
Encyclopædius 12:09, 20 May 2020 (UTC)
@
Encyclopædius: and @
Aymatth2: Where's the community endorsed consensus from WikiProject Brazil/WikiProject Latin America/Village Pump? Where's your driver list of proposed articles? How are you proposing to improve the page so that these aren't perma stubs with no chance at improvement? Per
WP:FAIT and
WP:MASSCREATION it's expected that there will be a very large and well attended consensus that this bulk creation is appropriate. In short, Not a good task for a bot. Table this until you have a community consensus in hand, as very few bot operators will roll the dice on doing this task in exchange for having their bot revoked.
Hasteur (
talk)
17:40, 23 June 2020 (UTC)
@
Hasteur: The title of this proposal is a bit misleading. The idea is not to create a mass of short stubs and poorly built articles, but to improve the existing mass of short stubs and poorly built articles. There are 5,570 of them, all notable based on
WP:NGEO. The
Brazilian Institute of Geography and Statistics (IBGE) maintains a database with extensive information on the geography, population, economy etc. of each of them. See
Cocalinho and
Brasil / Mato Grosso / Cocalinho for a sample IBGE entry. Using this information, and information from sources like
GeoNames and
Mindat.org, we can upgrade a stub like
Araguaiana into a more useful article like
Cruzeiro do Sul, Paraná. This seems uncontroversial. The proposal is to develop a screen-scraping tool that will make it easier to copy the data into each Brazil municipality stub.
There are quite a lot of these database-type websites on different topics, displaying each entry in highly standardized format. There is no copyright concern as long as we stick to dates, numbers etc. It would probably be very difficult to develop a generic screen-scraper that could be configured to handle them all, but might be possible to develop reusable logic that could make it fairly simple to develop a new one. That seems to be worth discussing.
Aymatth2 (
talk)
19:02, 23 June 2020 (UTC)
Not a good task for a bot. Absolutely NOT. FULL STOP. Get a broadly endorsed consensus at Village Pump, as there have been several cases (NRHP, Betacommand, etc.) of automated database dumps that have gotten editors drummed out, either in part (restrictions on creation) or in full (community/Arbitration bans). While you may think this is uncontroversial, it requires a well attended RFC to confirm the sense of the community.
Hasteur (
talk)
20:58, 23 June 2020 (UTC)
Greetings. At
WP:DYKN, the image size is based on the orientation of the image; vertical images at 120px, square at 140, and horizontal at 160. However there is no way to set the resolution during nomination, which means that even experienced editors often forget to fix the size of the image, and new editors don't know that they should.
I am proposing that a bot do a daily check and update the resolution where needed. In order to cut down on the amount of resources required, it needs only look at recent additions.
It would, I'm guessing, work something like this:
Generate a list of all DYK nominations added to
Template talk:Did you know since the task was last run. (It can't use the nomination date because there's a 7-day window to nominate.)
Since nominations can be reviewed quite quickly and moved to the
Template talk:Did you know/Approved page, the bot would need to check there as well. While the main Nominations page has a "Current nominations" section comprising the current date and the previous seven days (this is updated at midnight every day), the Approved page doesn't have the equivalent section. Depending on how often it runs, the bot may need to check earlier on the page, because the dates are not when the nomination was added, but rather when work on the article began, which is supposed to be no more than seven days before nominating. (But is sometimes a little late.)
BlueMoonset (
talk)
01:44, 7 June 2020 (UTC)
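The size rule described at the top of this request (vertical 120px, square 140px, horizontal 160px) could be sketched as follows, whether it ends up in a bot or a module. The tolerance for what counts as "square" is an assumption, since the request does not define it, and the function name is invented for the example.

```python
def dyk_image_size(width, height, tolerance=0.05):
    """Return the DYK display width in px for an image of the given dimensions."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    ratio = width / height
    if abs(ratio - 1) <= tolerance:
        return 140  # roughly square
    return 160 if ratio > 1 else 120  # horizontal, else vertical
```

The bot would then compare this value with the size set in the nomination and edit only where they differ.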
I wonder if it's possible to do this with a module? I'm not familiar with them, but a quick glance shows file metadata can provide height and width.
[16] If it is possible to do with a module, that'd probably be better, and it would update automatically rather than having to wait for periodic bot runs.
ProcrastinatingReader (
talk)
02:34, 14 July 2020 (UTC)
in articles that transclude
Template:Infobox drug. There are a few rare variations that I can remove by hand or that require manual decision whether to remove, but this seems to be the vast majority and a conservative regex for it. This is a one-time cleanup pass that I started doing with
WP:JWB before I realized it was possibly the majority of the 12K articles in that transcluders list.
DMacks (
talk)
19:18, 17 June 2020 (UTC)
Am I correct in that the parameter itself has not been deprecated, just the usage where a value and units are given?
Primefac (
talk)
19:38, 17 June 2020 (UTC)
Mostly-correct. The units should not be given with the number...that's a mistake that needs to be fixed. The majority of cases, even the number does not need to be given (it's a deprecated use-case of the field, not the field deprecated as a whole). One detail I had in my offline note and forgot to paste (yikes! sorry!) is to limit the scope to pages where there is a:
/\| *C *= *\d/
as those are pages where the value can be automatically calculated, so the field is not needed. In terms of regex, this is almost always on the line immediately preceding the /molecular_weight/ if it would be useful to have a single regex rather than "one regex to filter the pages, another to replace". Rather than simply fixing the units across-the-board, this is an opportunity to upgrade the usage wherever easily possible. There are a bunch of special cases, where the field contains other than a single number or where the number really does need to be manually specified, but I'm setting those aside for now...once the majority of mindless fixes are done, individual decisions about each remaining case can be made.
DMacks (
talk)
00:37, 18 June 2020 (UTC)
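To illustrate the two-step approach described above (one regex to filter the pages, another to look at the field), here is a rough sketch. The `C =` filter is the regex given in the comment; everything else is an assumption for illustration, including the `molecular_weight` line pattern, the idea that a bare number means "redundant", and the category names returned.

```python
import re

# Filter from the discussion: pages where the carbon count is given.
HAS_CARBON_COUNT = re.compile(r"\| *C *= *\d")
# Assumed shape of the field: on its own line, starting with a pipe.
MW_FIELD = re.compile(r"^\| *molecular_weight *= *(.*?) *$", flags=re.MULTILINE)
PLAIN_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def classify(wikitext):
    """Sort a page into 'out of scope', 'nothing to do', 'redundant' or 'malformed'."""
    if not HAS_CARBON_COUNT.search(wikitext):
        return "out of scope"   # value cannot be calculated automatically
    match = MW_FIELD.search(wikitext)
    if match is None or match.group(1) == "":
        return "nothing to do"  # field absent or empty
    if PLAIN_NUMBER.fullmatch(match.group(1)):
        return "redundant"      # bare number: removable without a decision
    return "malformed"          # units or other text: fix or review by hand
```

Only the "redundant" pages would be candidates for an unattended pass; "malformed" ones correspond to the special cases set aside for individual decisions.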
Might be useful to set up some tracking categories, then; those that don't need the param, and those that need the units removed.
Primefac (
talk)
00:40, 18 June 2020 (UTC)
...has stabilized around 5600 pages. Next step is to filter the ones whose field is malformed (mistake to fix) rather than just redundant (deprecated but valid format).
DMacks (
talk)
14:40, 19 June 2020 (UTC)
Deferred I'm JWB'ing it, with a looser regex and manual oversight...manually annoying but still scratches the itch.
DMacks (
talk)
13:43, 21 June 2020 (UTC)
Bypassing redirects for hatnotes and see also sections
This task might be better for semi-automated editing than a straight bot, but I'll throw it out here. I often come across hatnotes and see also sections that link to an old title for a page, e.g.
this sort of fix or
this one. Would it be possible to create a bot or a tool that lists or fixes instances where hatnotes or see also sections include a redirect to a page that has been moved to a new title? {{u|Sdkb}} talk 05:30, 7 July 2020 (UTC)
Normally, if a page title is not broken, the page move won't succeed. If it does succeed, it's likely there's a good enough reason that it'd be worth changing the see also links and hatnotes as well. {{u|Sdkb}} talk 01:13, 10 July 2020 (UTC)
Oppose There can be reason to use a correct but alternate name in a hatnote because it is shorter, such as the example Primefac gave above. (
t ·
c) buidhe10:38, 10 July 2020 (UTC)