The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard. The result of the discussion was Denied.
Function details: We have generated a dataset of spelling mistakes, ranging from minor to larger ones, from a Wikipedia dump, using heuristics such as edit distance, popularity, and frequency of use. We manually verify about 10-15 such changes every day on a portal we designed ourselves at https://www.iitg.ac.in/cseweb/WikiFeedback. This bot will apply those verified changes from our SQL server to Wikipedia.
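For readers unfamiliar with this kind of heuristic, the following is a minimal sketch of how edit distance and frequency of use could be combined to propose a correction. It is an illustration only: the applicant did not publish their heuristics, and the function names `edit_distance` and `suggest`, the thresholds `max_distance` and `min_ratio`, and the sample counts are all hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(word, freq, max_distance=1, min_ratio=50):
    """Return the most frequent near neighbour of `word`, or None.

    A candidate qualifies only if it is within `max_distance` edits and
    occurs at least `min_ratio` times as often as `word` itself.
    Both thresholds are hypothetical illustration values.
    """
    own = freq.get(word, 0)
    best = None
    for cand, count in freq.items():
        if cand == word or count < min_ratio * max(own, 1):
            continue
        if edit_distance(word, cand) <= max_distance:
            if best is None or count > freq[best]:
                best = cand
    return best


# Made-up word counts standing in for statistics from a dump.
freq = Counter({"government": 1000, "goverment": 3, "governments": 200})
print(suggest("goverment", freq))   # a rare word next to a common one
print(suggest("government", freq))  # a common word has no better neighbour
```

A proposal produced this way is only a candidate: a rare spelling may be a legitimate variant rather than an error, so each one would still need human review before being applied.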
Discussion
Hi there. Interesting project. For the purposes of approval, I'm afraid it would not be responsible for us to approve a closed-source bot that also requires registration to use and could violate WP:CONTEXTBOT unless all the edits are supervised. But it appears you might be affiliated with one of the Indian Institutes of Technology (judging by the domain of your site), so I'm not going to decline this BRFA yet, as your project may be promising, but the function will need tweaking to meet our standards for bots. If we did approve a bot like this, we would need a high level of confidence that it isn't going to make problematic changes (see WP:CONTEXTBOT). You do mention that you'll manually verify 10-15 a day. The bot could be coded as a tool using OAuth to make the edits on a user account directly (see WP:BOTMULTIOP). A bot account isn't usually needed to make a tool like this. All the edits would have to be manually verified by a human before going live on an article to make sure that they're okay.
ProcrastinatingReader (talk) 14:05, 28 April 2021 (UTC)
Hello ProcrastinatingReader. Yes, I am a research student at one of the Indian Institutes of Technology. Sorry, I guess it wasn't clear from my previous writing that we have a high-confidence dataset, which will also be voted upon by many users from our research group on the portal I mentioned above. Based on user voting (expected 10-15 votes per day per user), our heuristics, and recent revisions to similar pages on Wikipedia, we would want our bot to make changes, keep tracking whether a human user reverts them, and notify us as well as our voters about such revisions. I couldn't describe the complete heuristics for generating the dataset since it is a live project. Also, since all changes are being verified manually, I believe the chance of problematic changes is quite low. If the non-availability of the source code is a deal-breaker, I can make it available, but only after getting permission from our group. Please do let me know what can be done to get this bot live.
K.Kapil77 (talk) 18:29, 28 April 2021 (UTC)
Source code is not required by default to be published, but it's encouraged, and for some tasks it can be preferable to have others look over it.
You can code your algorithm to generate diffs for a change however you like of course (eg using user voting or analysis of revisions). The key for the purposes of this bot is that you need a human to look over the change and make sure it's okay before it actually goes live on an article. If you're looking to move into unsupervised territory, this could maybe be re-evaluated if there's a track record of generated changes being accurate, but in the first instance I don't think a bot could be approved for unsupervised editing in this manner.
They have already said all edits are manually vetted, so I don't think CONTEXTBOT is applicable. There exist a lot of bots whose edits are manually supervised. Making it an OAuth tool is added complication which is useful only when the creators want to crowdsource vetting of the edits (here they indicate they'll do the vetting themselves). So all in all I don't see any issues here. Having a quick trial will likely help further review. –
SD0001 (talk) 12:42, 29 April 2021 (UTC)
Hmm. I guess if these conditions (all edits manually verified by the same editor) will not change in the future it won't actively be a problem. @SD0001: do you want to take it through trial?
ProcrastinatingReader (talk) 12:52, 29 April 2021 (UTC)
Approved for trial (25 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Please don't mark edits as minor for the trial (so that they show up in watchlists and RecentChanges). Let us know if you run into any issues. –
SD0001 (talk) 13:07, 29 April 2021 (UTC)
It's not at all clear which are the edits the bot would make and which are your personal edits, which is why bot trials should be done via the bot account. I see a lot of errors like these:
Also, the piping removals in edits like [4], [5], [6], [7] don't look appropriate for a bot to do. It seems the original pipings were editorial decisions to make the display text different from the article name, which may or may not be necessary – but that's not for a bot to decide. –
SD0001 (talk) 13:21, 25 May 2021 (UTC)
All edits by our bot are monitored and are reflected on the page only after manual confirmation (please read the bot description above). Each edit is in effect a spelling error identified by the bot and approved by a human. The types of edits our bot makes are: [8] - Spell Correction, [9] - Proper Reference, [10] - Correcting Improper Reference. These edits are based on proper heuristics and do not introduce mistakes, even where they do not completely resolve the issue. The piping removal you mentioned may simply be because the original entity page title is a better display name than the piped one; this was also manually verified. I don't see any issue with the bot's performance as such.
K.Kapil77 Bot (talk) 15:38, 25 May 2021 (UTC)
Please only make unambiguous spell corrections. Changing Varkari to Warkari isn't that, and even falls afoul of WP:DONOTFIXIT since it's not a misspelling – Asian names/words can have different acceptable romanisations. Regarding the 2nd edit ("Proper Reference"), it changed [[IIT Bombay|Indian Institute of Technology]] (IIT Bombay) to [[IIT Bombay|Indian Institute of Technology Bombay]] (IIT Bombay) on an article that's on Bombay, so it's obvious which IIT we're talking about. [11] added a link where one didn't exist before and wasn't necessary (see WP:OVERLINK). We can't let a bot-flagged account make edits like this. –
SD0001 (talk) 13:11, 26 May 2021 (UTC)
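To illustrate what restricting a tool to unambiguous corrections could look like, here is a rough sketch that applies a correction table only to all-lowercase words outside wikilinks and templates. It is an assumption-laden example, not the bot's actual code: `fix_typos`, the `PROTECTED` pattern, and the correction table are hypothetical, and treating capitalised words as proper nouns is only a crude proxy.

```python
import re

# Spans a conservative typo fixer leaves alone: [[wikilinks]] (target and
# piped display text alike) and {{templates}}. Nested markup is not handled.
PROTECTED = re.compile(r"\[\[.*?\]\]|\{\{.*?\}\}", re.DOTALL)
WORD = re.compile(r"[A-Za-z]+")


def fix_typos(wikitext, corrections):
    """Apply `corrections` only to all-lowercase words outside protected spans."""
    protected = [m.span() for m in PROTECTED.finditer(wikitext)]

    def in_protected(pos):
        return any(start <= pos < end for start, end in protected)

    def replace(match):
        word = match.group(0)
        # Skip links/templates, and skip anything with a capital letter,
        # which may be a name with several accepted romanisations.
        if in_protected(match.start()) or not word.islower():
            return word
        return corrections.get(word, word)

    return WORD.sub(replace, wikitext)


# Hypothetical correction table; the "Varkari" entry is deliberately included
# to show that capitalised words are never touched.
corrections = {"teh": "the", "goverment": "government", "Varkari": "Warkari"}
text = "teh goverment and [[teh page|teh label]] in Varkari"
print(fix_typos(text, corrections))
```

The trade-off is deliberate: genuine typos at the start of a sentence or inside link text are missed, and those cases are left for a human editor to handle.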
Alright! I’ll restrict these types of changes and push only spell corrections like typos. I’ll make the new edits via the bot account and update here by tomorrow.
K.Kapil77 Bot (talk) 13:54, 26 May 2021 (UTC)
No. [12] changes a MOS:DATERANGE-compliant date to a non-compliant form. I don't have the know-how to tell whether [13], [14] are net positives; they do look like it, but surely a bot going around making such edits is going to generate controversy when errors arise. These are more appropriately done via a human account. I don't see any edits here that are fixing spelling mistakes in words that aren't proper nouns. –
SD0001 (talk) 15:14, 10 June 2021 (UTC)
The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard.